Self-organization of the orientation-wheels observed in the visual cortex is discussed
from the viewpoint of topology. We argue, within a generalized model of Kohonen's
feature mappings, that the existence of the orientation-wheels is a consequence of the
Riemann-Hurwitz formula from topology. Along the same lines, we estimate the partition
function of the model and show that, regardless of the total number N of
orientation-modules per hypercolumn, the modules are self-organized, without
fine-tuning of parameters, into a definite number of orientation-wheels per
hypercolumn if N is large.



Keywords — Orientation columns, Visual cortex, Self-organizing feature maps, 
Neural networks, Orientation singularities, Riemann-Hurwitz theorem, Spontaneous 
symmetry breaking, Topological maps 



Local organization of orientation columns 






1 Introduction 

Among the various columns observed on the cortical surface, the orientation columns in
the primary visual area V1 exhibit one of the most attractive structural organizations.
Here, expanding our previous study (Yamagishi, 1994, referred to as I hereafter), we
investigate the mechanism of the self-organization of these orientation columns from
the viewpoint of spontaneous symmetry breaking. In so doing we hope to gain a deeper
understanding of the role of these columns, not only in the context of
self-organization but also in the information processing problem in the visual cortex.

Since their first discovery by Hubel and Wiesel (see Hubel and Wiesel, 1977, for a
survey), these distributed, localized areas of iso-orientation preference have
attracted much attention. In their earlier model (see Figure 1), proposed on the basis
of experimental data that were not conclusive, the columns are regarded as slabs
embedded between ocular dominance boundary walls. Each slab is assigned detectors with
a specific orientation preference. In recent experiments, however, as more precise
data have become available, a more detailed structure of this organization has begun
to emerge.

Figure 1. Ice-cube model of orientation columns by Hubel and Wiesel. Orientation columns are
regularly embedded in the R and L ocular dominance areas.

In one of the most recent data sets, Blasdel (1992; see also Obermayer et al., 1993)
showed with higher precision that the orientation columns are basically organized
around two classes of structural elements: point singularities and fractures. The
general features near the ocular dominance boundary walls remain the same as in the
previous results; the orientation and ocular dominance walls intersect at right
angles. It has also been shown that these structural elements (singular points and/or
fractures) typically lie between the ocular dominance walls, with an average number of
3 ~ 4 per hypercolumn. This structure constitutes a unit module (approximately a
1 mm x 1 mm square region on the cortical surface in the case of monkeys) that
contains the right set of feature-detector cells for color, orientation, spatial
frequency, etc. for an associated segment of the visual field (corresponding to
0.8° x 0.8° near the foveal region).

The purpose of our study is, first, to consider the precise meaning of modular
organization when the feature space consists of continuous (or infinitely many)
degrees of freedom, as in this case. In principle, we would need an infinite number of
column-modules if we enforced modular organization à la Hubel and Wiesel in full
precision. At the same time, we do not know the minimum resolution of orientation
angles, or even whether such a concept is meaningful at the orientation-column level.
In the next section, after reviewing the morphology of this columnar organization, we
propose a "statistical definition of orientation modules" to address this. There the
module boundaries are not defined as definite objects, as in the classical picture of
Hubel and Wiesel; rather, they are defined only statistically.

A quantitative analysis of this scheme is made in §§3-4. We use a generalized version
of Kohonen's feature mapping algorithm (Kohonen, 1982, 1989) to check this idea.
Models of the same class have been used before for the ocular dominance columns
(Obermayer et al., 1992, 1990; Yamagishi, 1994). In our model (I) we added an extra
one-dimensional (feature) space to the ordinary Kohonen model that is frequently used
in two-dimensional self-organization problems on the cortical surface, and analyzed
its continuum limit with respect to the spacing between neurons. Introducing a certain
regularization method to handle the rather singular "winner-takes-all" mechanism, we
derived that the lowest stable state is a solitonic kink-anti-kink state corresponding
to the ocular dominance domain-walls. Here we follow basically the same strategy. The
added feature space, however, is not a line interval [-1,1] but a compact space, the
circle $S^1$ together with its interior, which represents the orientation degrees of
freedom, with the probability of each orientation indicated by the radial distance
from the center of the $S^1$. [The center of this feature space represents "no
orientation preference".] In order to derive a global configuration of the orientation
columns, we would need a full-fledged system that includes both ocular and orientation
degrees of freedom. In this paper, however, we address more fundamental questions: why
the orientation singularities appear, and how they are related to the (statistical
definition of) orientation modules proposed in this paper. For this purpose alone, we
do not need the feature-space terms related to the ocular dominance columns.

We restrict our model to one hypercolumn area only. [We would like to address the
global organization problem in future publications.] The solitonic states in this case
correspond basically to harmonic maps with respect to the retinotopic coordinates. In
two dimensions (and in a simple topology) they are just linear combinations of
holomorphic and anti-holomorphic mappings. As an analytic continuation of the global
topographic mapping from the retina to the visual cortex, which is roughly
approximated by a holomorphic (and anti-holomorphic) mapping, we choose holomorphic
mappings to describe the orientation columns in one ocular domain of the hypercolumn
(and anti-holomorphic ones for the other ocular domain). In a modular organization
these mappings are basically one-to-many. (The image in each orientation column is a
replica of the original one mapped from the retina onto the hypercolumn region in
layer 4C.) Quite generally, for a holomorphic mapping from one compact space to
another, this global information (the multiplicity or "mapping degree") imposes
restrictions on the local structure of the mapping, i.e., on the existence and number
of singularities. We will show that the orientation singularities correspond to these
singularities. We explain this point in detail in §3.

In §4, we derive the statistical distribution function (partition function) of the
above mappings in the path-integral formalism. (See, e.g., our previous work I.) The
result is written as a sum over the mapping degree and the number of singularities. We
will see that this corresponds precisely to our statistical definition of the
orientation modules. This is because the size (and accordingly the iso-orientation
width) of the module changes according to the mapping degree selected from layer 4C to
layers 2+3; the mapping degree is large if the iso-orientation width is narrow, and
small in the converse case. The partition function contains all these contributions in
the statistical sum.

One interesting result of our calculation is that this partition function takes its
maximum value at a finite number of singularities, regardless of the mapping degree,
if the latter is large enough. This implies that on average there exists a stable,
definite number of orientation wheels in the hypercolumn anywhere on the cortical
surface, regardless of the size of the orientation modules. This agrees with what is
actually observed on the cortical surface.

Finally we discuss some future issues in §5. 

2 Definition of orientation modules 

In this section we present our precise definition of orientation modules, starting with
the fundamental anatomical descriptions on which our theory is based.

Our main focus in this paper is the orientation columns in layers 2+3 observed in
macaque visual cortex*. Major inputs to neurons in these orientation columns come
from layer 4†. Typical dendrites of the orientation detector neurons extend
200 ~ 500 μm, which is about the scale of the ocular dominance width, in all
directions almost uniformly. The lateral spread of the interlaminar axons from layer
4C is 300 μm ~ 400 μm right before the dendritic input into cells in layers 2+3. Thus
the precise topographic mapping from the retina established in layer 4C is maintained
there only on scales larger than the hypercolumn size (Fitzpatrick et al., 1985). Some
like-orientation columns in neighboring hypercolumns are interconnected like clusters
via rather long (a few mm up to several mm) collateral axonal arbors from each
orientation cell across the ocular walls (Gilbert and Wiesel, 1983, 1989). Some
cross-orientation (inhibitory) interactions between such clusters are also observed
(Matsubara et al., 1987; McGuire et al., 1991).

The receptive field profile of each constituent (simple) orientation detector cell
typically ranges from 0.5° x 0.5° to 1.5° x 1.5° in the central visual field. These
profiles have been compared to 2D Gabor filters, and with an appropriate adjustment of
parameters good agreement has been demonstrated (Jones and Palmer, 1987)‡.

In recent high-precision experimental data (Blasdel, 1992), the observed
iso-orientation domains are organized around two structural centers: point
singularities and fractures. Going around a singular point, the orientation preference
angle of the iso-orientation domains either increases or decreases. On average, these
clockwise and anticlockwise orientation centers are distributed evenly in every
hypercolumn.

Statistically, among clockwise (or anticlockwise) orientation centers, the position of
a given orientation preference within the orientation wheel seems to be random;
namely, on average one orientation preference (say, the 3:00 direction in the visual
field) occurs at any angle in the orientation wheels on the cortical surface, although
the orientation preference changes monotonically, either increasing or decreasing,
within one wheel (with a certain increment depending on the experimental set-up). It
should also be emphasized that an iso-orientation domain is not separated into two or
more pieces within one wheel: it occurs only once. This plays an important role later.

*Localized areas of orientation preference are also observed in other layers (5, 6) and
visual areas (e.g. V2 (area 18); Bonhoeffer et al., 1993). However, as a precaution we
restrict our theory to layers 2+3.

†Some from 4A, 4B, and some directly from 4Cα, 4Cβ and the LGN.

‡These results are not conclusive. Other possibilities are also known (e.g. Hawken and
Parker, 1987).

This monotonic change of orientation preference is interrupted (or stopped) by
"fractures", where portions of iso-orientation domains from other wheels intersect.

The shape of the iso-orientation domains in each wheel does not seem to be definite.
It tends to be elongated in the radial direction from the singularity on the cortical
surface, but not conclusively so, and it partly reflects an artifact, namely how one
defines the iso-orientation width. There is almost no correlation between these shapes
and the preferred orientation directions of the cells, or the positions of the
iso-orientation domains in the wheel.

From these observations we can safely conclude the following. Firstly, the filtering
mechanism for each orientation, via e.g. 2D-Gabor-filter-like structures, seems to be
independent of the self-organization mechanism (since the shape of the iso-orientation
domain is rather arbitrary and its position within the orientation wheels does not
depend on the orientation preference in the visual field).

Secondly, it looks quite unlikely that the total monotonic change of iso-orientation
preference over 180° in one orientation wheel plays any essential role in orientation
detection (since the fractures can interrupt it arbitrarily).

Therefore, this self-organization is something that merely assembles cells of similar
functionality semi-locally, with no definite global order. The existence of the
singularities itself looks like a mere consequence of this organization mechanism
rather than an immediate necessity for the information processing of orientations. In
contrast, the monotonic change (either just increasing or just decreasing, but not
increasing and then decreasing) of the orientation preference in a semi-local region
seems to play an important role.

Now, based upon these observations, in the remainder of this section we present our
definition of the "modular organization" in the case at hand, where the feature space
consists of continuous degrees of freedom. In the original idea of Hubel and Wiesel,
each orientation slab, the "module" of orientation detection, was assumed to receive a
replicated (but somewhat distorted) image projected over the hypercolumn-size region
where it resides, and each of the slabs was assigned detectors with a specific
orientation preference. Those orientation slabs, associated with the various possible
orientation directions, were then supposed to fill up the entire hypercolumn region as
in Figure 1. In this way, 3-dimensional information (i.e., the totality of
"orientations at every image point") was considered to be processed. However, the
"orientation" has fully continuous (infinitely many) degrees of freedom. If we enforce
this scheme in full precision, we need an infinite number of orientation columns and
nerve cells.

However, we also know the following fact. If one tries to detect an orientation
precisely, one has to sacrifice positional precision in the direction parallel to that
orientation (Daugman, 1985). This holds not only at the level of each detector cell
but in general. Hence, in the limit of the finest orientation precision, the projected
image of the hypercolumn eventually reduces to a one-dimensional degree of freedom
perpendicular to the orientation direction. Namely, in the space of detectable
information, the information is not fully 3-dimensional. We need a certain trade-off
between the orientation precision and the positional precision parallel to it.

Taking this into account, we consider the information-processing scheme and its
associated modular organization as follows: "The module of iso-orientation (and its
boundary) in the classical sense of Hubel and Wiesel* is not definite". Namely, we
propose that the extent of the module virtually changes according to the image one is
concentrating on. If one tends to focus (via higher cortical functions) on the
sharpness of the orientation of an edge rather than on its position (e.g. the case of
a long edge), then only the outputs from those detector cells that fall into a quite
narrow range of orientations are selected, as if those cells constituted one
"orientation module" associated with that orientation. In that case the "orientation
module" encompasses a very thin iso-orientation domain. Conversely, if one tends to
extract information about orientation with moderate precision at a specific, precise
point in the visual field (e.g. a very short edge, say less than a quarter of a
degree, i.e. about 1 mm at a viewing distance of 25 cm), then the information from a
moderately narrow range of orientation preferences is selected, as if those
orientation cells constituted one orientation module. Namely, the orientation module
is dynamically configured, and there is no definite boundary between orientation
modules.

Stated differently, the size of the orientation module is here defined statistically.
There is no definite (or classical) modular organization in the sense of Hubel and
Wiesel.

*Here the "global order" of the iso-orientation slabs as seen in Figure 1 is not the
point of our attention.

To achieve this multiple projection scheme, almost all the orientation detector cells
have to receive inputs from all axonal arbors projected from layer 4 within the
hypercolumn to which they belong. The observations on axon morphology, which we
briefly sketched above for layer 4C and its superficial laminae, are not inconsistent
with this hypothesis.

Unexpected support for this scheme actually comes from the real organization of the
orientation columns. The organizational structure necessary to attain this scheme is,
firstly, the semi-local continuity of the orientation preference over a range no
larger than, say, 30 degrees of orientation width. We do not need global or 180°
continuity. Most important of all, however, is a simple and easily forgotten fact: an
iso-orientation patch should not be separated into two or more pieces in one
orientation wheel; namely, the preference angle should not change in such a fashion
that it increases and then decreases continuously. At least from one fracture to the
next, the preference angle has to change monotonically within one orientation wheel,
either just increasing or just decreasing; otherwise our scheme does not work.

This scheme itself does not exclude global order of orientation domains, e.g., the 
organization shown in Figure 1. But we do not need such high global order — semi- 
local order is enough to attain this scheme, and that is more plausible if the cells are 
self-organized on a non-genetic basis. 

3 Role of holomorphicity 

In this section, we present our model for describing the local organization and
formulate our orientation modules. We use a model of basically the same class as in
our previous work I, i.e., a generalization of Kohonen's self-organizing feature maps.
To establish notation, let us briefly recall its formulation.

3.1 The model 

Let $\vec w(\vec r,\tau)$ be a 2-component vector-valued function defined at every
cortical point $\vec r$ at time step $\tau$. We assume that each $\vec w(\vec r,\tau)$
represents a point in the retinal field $E$. Then the set of vectors
$\{\vec w(\vec r,\tau) \mid \vec r \in F\}$, where $F$ is the set of cortical points,
defines the neural connections between the retinal surface and the visual cortex, and
gives rise to a mapping of the retinal image onto the visual cortex. Here, in
addition, we consider a complex-valued function $w_O(\vec r,\tau)$ (where
$|w_O(\vec r,\tau)| \le 1$) that belongs to a hypothetical orientation feature-space
$D^2$, a flat disc of radius 1 with its boundary circle $S^1$. We can think of
$w_O(\vec r,\tau) \in D^2$ as indicating the preference of the orientation detector
cell placed at the cortical point $\vec r$, whose preferred orientation angle is
$\arg[w_O(\vec r,\tau)]$ with probability $|w_O(\vec r,\tau)|$. (Namely, the cell has a
polarization $|w_O(\vec r,\tau)|$ in the preferred orientation
$\arg[w_O(\vec r,\tau)]$. We assume no orientation preference if
$|w_O(\vec r,\tau)| = 0$.) Then the pair $(\vec w(\vec r,\tau), w_O(\vec r,\tau))$
represents the retinal information mapped to the cortical point $\vec r$ together with
the orientation preference at that point.

We update this system using the following learning rule. We start with a random
configuration of $w = (\vec w(\vec r,\tau), w_O(\vec r,\tau))$ at the time step
$\tau = 0$, and repeat the two-step process stated below upon every presentation of a
datum $\xi = (\vec\xi, \xi_O) \in E \otimes S^1$. ($\xi_O$ is a complex number of
modulus 1 specified later.) The first step is the so-called "winner-takes-all":

[UPDT:1] Select the winning point $\vec r^*$ on the cortical surface where the retinal
vector $w^* = (\vec w(\vec r^*,\tau), w_O(\vec r^*,\tau))$ is closest to the datum
$\xi = (\vec\xi, \xi_O)$ presented at the time $\tau$.

Here the (distance)$^2$ between two points $w_a$, $w_b$ in the feature-data space is
defined as

$$ d(w_a, w_b) = \|\vec w_a - \vec w_b\|^2 + \sigma_O^2\, h(w_{Oa}, w_{Ob}). \tag{1} $$

The second term, which represents a perturbation of the ordinary Kohonen model,
becomes significant only on scales smaller than $\sigma_O$. We take $\sigma_O$ to be
the typical size of the orientation columns, which is far smaller than the scale of
the retinotopic organization (~ the scale of the visual area V1). We also assume the
expansion

$$ h(w_{Oa}, w_{Ob}) = g_1 |w_{Oa} - w_{Ob}|^2 + g_2 |w_{Oa} - w_{Ob}|^4 + \cdots, \tag{2} $$

where $g_1\,(\sim O(1)) \gg |g_2| \gg \cdots$ are real numerical constants. As we will
see later, this choice of $h(w_{Oa}, w_{Ob})$ respects a chiral symmetry of the
orientation wheels at the global scale; namely, clockwise and anticlockwise wheels
appear evenly in the global average.
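
The winner selection [UPDT:1] with the distance (1) and the expansion (2) can be
sketched numerically as follows. This is only a minimal illustration: the function
names, the array layout, and the values chosen for $\sigma_O$, $g_1$ and $g_2$ are
assumptions made for the example and are not taken from the text.

```python
import numpy as np

def h(w_o_a, w_o_b, g=(1.0, 0.1)):
    """Orientation term of Eq. (2), truncated after the listed coefficients.

    g = (g1, g2, ...) are illustrative values, not taken from the paper.
    """
    d2 = np.abs(w_o_a - w_o_b) ** 2
    # g1*|dw|^2 + g2*|dw|^4 + ...
    return sum(gk * d2 ** (k + 1) for k, gk in enumerate(g))

def distance2(w_a, w_oa, w_b, w_ob, sigma_o=0.1, g=(1.0, 0.1)):
    """Squared distance of Eq. (1): retinal part plus sigma_O^2 * h."""
    retinal = np.sum((w_a - w_b) ** 2, axis=-1)
    return retinal + sigma_o ** 2 * h(w_oa, w_ob, g)

def winner(w, w_o, xi, xi_o, sigma_o=0.1, g=(1.0, 0.1)):
    """[UPDT:1]: index of the cortical point closest to the datum (xi, xi_o).

    w   : (n, 2) real array, retinal vectors of the n cortical points
    w_o : (n,) complex array, orientation features inside the unit disc
    """
    return int(np.argmin(distance2(w, w_o, xi, xi_o, sigma_o, g)))
```

When two cortical points carry the same retinal vector, the orientation term alone
decides the winner, which is the role of the perturbative second term in (1).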

The second step is the updating rule for the configuration
$w = (w_i(\vec r,\tau)) = (\vec w(\vec r,\tau), w_O(\vec r,\tau))$:

[UPDT:2]

$$ \langle \Delta w_i(\vec r,\tau)\, P(\xi,\tau) \rangle_\xi = -\varepsilon(\tau)\, \frac{\delta \mathcal{E}[w]}{\delta w_i(\vec r,\tau)}, \qquad \Delta w_i(\vec r,\tau) = w_i(\vec r,\tau+1) - w_i(\vec r,\tau). $$

Here $\langle \cdots P(\xi,\tau) \rangle_\xi = \sum_\xi \cdots P(\xi,\tau)$ stands for
an average over the given probability function $P(\xi,\tau)$ of the presented data
$\xi$, $\varepsilon(\tau)$ ($> 0$) is a (decreasing) learning coefficient with
$\varepsilon(\tau) \downarrow 0$ as $\tau \to \infty$, and $\mathcal{E}[w]$ is an
energy function that we specify below. This process stops when the system reaches a
stationary point $\delta \mathcal{E}[w]/\delta w_i(\vec r,\tau) = 0$.

The energy function we employ is a perturbation of the one frequently used in the
retinotopic self-organization problem. Let
$\Lambda_{\vec r \vec r^*} = A^{-1} \exp(-\|\vec r - \vec r^*\|^2 / 2\sigma_O^2)$ be a
lateral correlation that represents a short-range excitatory signal from the winning
point $\vec r^*$, with $A$ being a positive numerical constant, and let the
data-profile $\Gamma_{\vec r^*}[w]$ at that point be defined as

$$ \Gamma_{\vec r^*}[w] = \{ \xi = (\vec\xi, \xi_O) \in E \otimes S^1 \mid \forall \vec r \in F,\ d(\xi, w(\vec r^*,\tau)) \le d(\xi, w(\vec r,\tau)) \}. \tag{3} $$

The latter is the set of data that a cortical cell at the point $\vec r^*$ can handle
under the given configuration $w$. Using these quantities, the energy function
$\mathcal{E}[w]$ is written as

$$ \mathcal{E}[w] = \frac12 \sum_{\vec r, \vec r^*} \Lambda_{\vec r \vec r^*} \sum_{\xi \in \Gamma_{\vec r^*}[w]} P(\xi,\tau) \left( \|\vec\xi - \vec w(\vec r,\tau)\|^2 + \sigma_O^2\, h(\xi_O, w_O(\vec r,\tau)) \right). \tag{4} $$

As we noted in our previous work I, retaining only the $g_1$-term
($g_2 = g_3 = \cdots = 0$) in $h(\xi_O, w_O(\vec r,\tau))$ reduces this system to the
standard Kohonen model for the mapping of the feature space $E \otimes D^2$ onto the
cortical surface $F$.
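
As a rough numerical illustration of the energy (4) in the $g_1$-only case, the sketch
below estimates the energy from a finite sample of data and performs one gradient step
for a single datum with the winner held fixed. The sum over the data profile is
replaced by an empirical average over the samples; the function names, the grid of
cortical points and all parameter values are assumptions made for the example.

```python
import numpy as np

def lateral(r, sigma_o=1.0, A=1.0):
    """Lateral correlation exp(-|r - r*|^2 / (2 sigma_O^2)) / A for all pairs."""
    d2 = np.sum((r[:, None, :] - r[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma_o ** 2)) / A

def cost_matrix(w, w_o, xi, xi_o, sigma_o=1.0, g1=1.0):
    """cost[s, n]: squared distance between sample s and cortical point n."""
    retinal = np.sum((xi[:, None, :] - w[None, :, :]) ** 2, axis=-1)
    return retinal + sigma_o ** 2 * g1 * np.abs(xi_o[:, None] - w_o[None, :]) ** 2

def energy(r, w, w_o, xi, xi_o, sigma_o=1.0, g1=1.0, A=1.0):
    """Empirical estimate of Eq. (4) with g2 = g3 = ... = 0."""
    cost = cost_matrix(w, w_o, xi, xi_o, sigma_o, g1)
    win = np.argmin(cost, axis=1)            # winner r* of each sample
    lam = lateral(r, sigma_o, A)             # symmetric in (r, r*)
    return 0.5 * np.mean(np.sum(lam[win, :] * cost, axis=1))

def kohonen_step(r, w, w_o, xi_s, xi_o_s, eps=0.1, sigma_o=1.0, g1=1.0, A=1.0):
    """One descent step for a single datum, gradient taken at fixed winner."""
    cost = cost_matrix(w, w_o, xi_s[None, :], np.array([xi_o_s]), sigma_o, g1)[0]
    star = int(np.argmin(cost))
    lam = np.exp(-np.sum((r - r[star]) ** 2, axis=-1) / (2.0 * sigma_o ** 2)) / A
    w_new = w + eps * lam[:, None] * (xi_s - w)
    w_o_new = w_o + eps * sigma_o ** 2 * g1 * lam * (xi_o_s - w_o)
    return w_new, w_o_new
```

Each step pulls the winner and its lateral neighbours towards the presented datum,
which is the familiar Kohonen update; the relative factor $\sigma_O^2 g_1$ in the
orientation component follows from differentiating the second term of (4).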

We consider the time development of this system under presentation of a data set
$\{\vec\xi, \xi_O\}$ with probability $P(\vec\xi, \xi_O; \tau) = P(\vec\xi)$; namely,
we use totally random inputs that are homogeneous with respect to the orientation
preference. We also restrict our orientation data to $\xi_O = e^{i\pi k/N}$
($k = 0, 1, \ldots, N-1$) for some positive integer $N$. This means in particular that
each of our input data is always 100% polarized in one of a finite number of possible
orientation directions.
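
Such input data could be generated, for instance, as in the following sketch. The
uniform distribution of the positions over a square patch is only an illustrative
stand-in for $P(\vec\xi)$, and the function name and arguments are hypothetical.

```python
import numpy as np

def sample_data(n_samples, n_orient, rng, extent=1.0):
    """Draw (xi, xi_O) pairs.

    Positions are uniform in a square patch of side `extent` (an assumed,
    illustrative choice of P(xi)); orientations are uniform over the N
    discrete, fully polarized values exp(i*pi*k/N), k = 0, ..., N-1.
    """
    xi = rng.uniform(0.0, extent, size=(n_samples, 2))
    k = rng.integers(0, n_orient, size=n_samples)
    xi_o = np.exp(1j * np.pi * k / n_orient)
    return xi, xi_o
```

Every sampled orientation has modulus 1, i.e. it lies on the boundary circle of the
orientation feature space.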

The local symmetry concerned here is the unitary group U(1),

$$ w_O(\vec r) \to e^{i\gamma(\vec r)}\, w_O(\vec r), \qquad \xi_O \to e^{i\gamma(\vec r)}\, \xi_O, \tag{5} $$

instead of the $Z_2$ that appeared in our previous work I. In this equation
$\gamma(\vec r)$ is an arbitrary real function of $\vec r$. Our energy function (4) is
easily seen to be form-invariant under the transformation (5).
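
The invariance can be checked numerically on the orientation term alone. The sketch
below assumes that the transformation rotates the orientation feature and the
orientation datum by the same local phase, and uses illustrative coefficients for the
expansion of $h$; since $h$ depends only on the modulus of the difference of its
arguments, a common phase drops out.

```python
import numpy as np

def h(a, b, g=(1.0, 0.1)):
    """g1*|a-b|^2 + g2*|a-b|^4 + ... with illustrative coefficients."""
    d2 = np.abs(a - b) ** 2
    return sum(gk * d2 ** (k + 1) for k, gk in enumerate(g))

rng = np.random.default_rng(0)
w_o = rng.uniform(-0.5, 0.5, 5) + 1j * rng.uniform(-0.5, 0.5, 5)   # inside the disc
xi_o = np.exp(1j * np.pi * rng.integers(0, 8, 5) / 8)             # on the circle
gamma = rng.uniform(0.0, 2.0 * np.pi, 5)     # arbitrary local phase gamma(r)
phase = np.exp(1j * gamma)

# Rotating both arguments by the same phase leaves h unchanged.
invariant = np.allclose(h(xi_o, w_o), h(phase * xi_o, phase * w_o))
```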

3.2 The picture for the ground state

First we recall results for the retinotopy problem derived without our perturbative
term $\sigma_O^2 h(\xi_O, w_O(\vec r,\tau))$. For this purpose it is convenient to go
over to the continuum limit of the model in the final convergent phase. We can assume
that the lateral correlation width $\sigma_O$ is very small. Then the equation for a
stationary state, $\delta\mathcal{E}[w]/\delta w_i = 0$, can be approximated as
follows.

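At the final convergent stage, the set $\Gamma_{\vec r^*}[w]$ is almost uniquely given
by the one-point set $\Gamma_{\vec r^*}[w] = \{\vec\xi = \vec w(\vec r^*,\tau)\}$.
Hence we have

$$ 0 = \frac{\delta\mathcal{E}[w]}{\delta w_i(\vec r)} = \int \Lambda_{\vec r \vec r^*}\, \bigl(w_i(\vec r) - w_i(\vec r^*)\bigr)\, P(w(\vec r^*))\, d^2 w(\vec r^*). $$

Writing $\vec r^* = \vec r + \vec e$, and taking into account the Gaussian factor
$\Lambda_{\vec r \vec r^*}$ of small width $\sigma_O$, we obtain in the limit
$\sigma_O \to 0$

$$ 0 = \partial^2 w_i(\vec r) + 2\, \{P(w(\vec r)) \cdot J(w(\vec r))\}^{-1}\, \partial_\beta \{P(w(\vec r)) \cdot J(w(\vec r))\}\, \partial_\beta w_i(\vec r). \tag{6} $$

Here $J(w(\vec r))$ is the Jacobian

$$ J(w(\vec r)) = \epsilon^{\alpha\beta}\, \partial_\alpha w^1(\vec r)\, \partial_\beta w^2(\vec r) $$

with the anti-symmetric symbol $\epsilon^{12} = -\epsilon^{21} = 1$, arising from the
change of integration variables $d^2 w(\vec r^*) \to d^2 e$. The stationary state has
to satisfy equation (6).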
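
The intermediate steps leading to equation (6) can be sketched as follows. The sketch
assumes a positive Jacobian, an isotropic Gaussian lateral correlation of width
$\sigma_O$ (so that its odd moments vanish and its second moment is proportional to
$\sigma_O^2 \delta_{\alpha\beta}$), and a Taylor expansion truncated at second order
in the displacement $\vec e$.

```latex
\begin{align*}
0 &= \int \Lambda(\vec e)\,\bigl(w_i(\vec r) - w_i(\vec r + \vec e)\bigr)\,
     (P J)(\vec r + \vec e)\; d^2 e
     && \text{(change of variables } d^2 w = J\, d^2 e\text{)} \\
w_i(\vec r) - w_i(\vec r + \vec e)
  &\simeq - e_\alpha\, \partial_\alpha w_i
          - \tfrac12\, e_\alpha e_\beta\, \partial_\alpha \partial_\beta w_i ,
  \qquad
  (P J)(\vec r + \vec e) \simeq P J + e_\beta\, \partial_\beta (P J) \\
0 &\simeq -\sigma_O^2 \Bigl[\, \tfrac12\, P J\, \partial^2 w_i
          + \partial_\beta (P J)\, \partial_\beta w_i \Bigr]
     && \text{(Gaussian moments, up to normalization)} \\
0 &= \partial^2 w_i + 2\, (P J)^{-1}\, \partial_\beta (P J)\, \partial_\beta w_i
     && \text{(divide by } -\tfrac12 \sigma_O^2\, P J\text{)}
\end{align*}
```

In particular, whenever the product $P \cdot J$ is constant the second term drops out
and the stationary condition reduces to the harmonic equation.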

It has been pointed out (Ritter and Schulten, 1986) that for the retinotopic
organization an appropriate solution is given in terms of a harmonic form satisfying

$$ \partial^2 w_i(\vec r) = 0. \tag{7} $$

In fact, given a probability distribution of the retinal data

$$ P(\vec w(\vec r)) \propto J^{-1}(w(\vec r)), \tag{8} $$

one obtains a logarithmic mapping relation that solves equations (6) and (7),

$$ r^1 + i r^2 = \log(w^1(\vec r) \pm i w^2(\vec r) + a), \tag{9} $$

between the visual cortex and the retinal surface. This mapping, from the half-disk
region of $w$ to the upper half-plane of $r$, gives a crude approximation to the
observed results. In equation (9) the choice of the sign $\pm$ depends on which ocular
(left- or right-eye) information one is considering, and $a$ is a constant giving the
length scale of the foveal region. We presume that the probability distribution (8) is
provided via the function $\vec w(\vec r)$ satisfying (9) at the start of the
iteration of the learning steps.
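
That the logarithmic mapping is harmonic, and that its Jacobian has a simple closed
form, can be verified with finite differences. The sketch below inverts the
logarithmic relation as $w^1 \pm i w^2 = e^{r^1 + i r^2} - a$; the value of $a$, the
step sizes and the function names are illustrative assumptions.

```python
import numpy as np

a = 0.5   # foveal length scale (illustrative value)

def w_of_r(r1, r2, sign=+1):
    """Invert r1 + i r2 = log(w1 + sign*i*w2 + a) for (w1, w2)."""
    zeta = np.exp(r1 + 1j * r2) - a          # equals w1 + sign * i * w2
    return zeta.real, sign * zeta.imag

def laplacian(f, r1, r2, hstep=1e-3):
    """Five-point finite-difference Laplacian of f at (r1, r2)."""
    return (f(r1 + hstep, r2) + f(r1 - hstep, r2) + f(r1, r2 + hstep)
            + f(r1, r2 - hstep) - 4.0 * f(r1, r2)) / hstep ** 2

def jacobian(r1, r2, sign=+1, hstep=1e-5):
    """J = d(w1, w2)/d(r1, r2) by central differences."""
    w1a, w2a = w_of_r(r1 + hstep, r2, sign)
    w1b, w2b = w_of_r(r1 - hstep, r2, sign)
    w1c, w2c = w_of_r(r1, r2 + hstep, sign)
    w1d, w2d = w_of_r(r1, r2 - hstep, sign)
    d11, d21 = (w1a - w1b) / (2 * hstep), (w2a - w2b) / (2 * hstep)
    d12, d22 = (w1c - w1d) / (2 * hstep), (w2c - w2d) / (2 * hstep)
    return d11 * d22 - d12 * d21
```

For the upper sign the Jacobian equals $e^{2 r^1}$, and for the lower sign
$-e^{2 r^1}$, so the two ocular branches have opposite orientation. A distribution
$P \propto J^{-1}$ makes the product $P \cdot J$ constant, which is why the stationary
condition then reduces to the harmonic equation.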

Next we consider the case with the perturbative term
$\sigma_O^2 h(\xi_O, w_O(\vec r,\tau))$ included, and restrict our attention to one
hypercolumn area only, which is about 5 to 10 times larger than the scale $\sigma_O$
of the orientation columns. Then we can still apply to the energy function (4) an
approximation method similar to the one we used in deriving the result (6). In
general, this time the winning point $\vec r^*$ is not in one-to-one correspondence
with $\vec w(\vec r^*)$, even in the final convergent phase. Instead, the
"winner-takes-all" rule [UPDT:1] ensures a one-to-one correspondence between
$\vec r^*$ and $(\vec w(\vec r^*), w_O(\vec r^*))$. Generally, since our orientation
data take only the discrete values $\xi_O = e^{i\pi k/N}$ ($k = 0, 1, \ldots, N-1$),
$w_O(\vec r^*)$ approaches a locally constant function at the final convergent stage,
i.e., $|\xi_O - w_O(\vec r^*)| \to 0$ (where local continuity of $w_O(\vec r)$ is
assumed). Thus the data-feature space points $(\vec w(\vec r^*), w_O(\vec r^*))$ that
share the same first coordinate $\vec w(\vec r^*)$ are distinguished by the discrete
"labeling" $w_O(\vec r^*)$. Then our data profile $\Gamma_{\vec r^*}$ consists of a
set of pairs $\{(\vec w(\vec r^*), w_O(\vec r^*))\}$ in the convergent phase. Using
this fact, we can expand our energy function (4) to lowest order in $\sigma_O$ as
follows:

$$ \Delta\mathcal{E} \propto A^{-1} \sigma_O^2 \int_{\mathrm{hypercolumn}} P(\langle w \rangle)\, |J(v(\vec r))| \left( (\partial \vec v(\vec r))^2 + g_1\, |\partial w_O(\vec r)|^2 \right) d^2 r. \tag{10} $$

This gives an extra energy per hypercolumn. Recall that $A$ is a measure of the
strength of the lateral correlation $\Lambda_{\vec r \vec r^*}$. To avoid confusion we
write $\sigma_O \vec v(\vec r) = \vec w(\vec r) - \langle w \rangle$ instead of the
$\vec w(\vec r)$ that was used for the global retinotopic problem. The global
retinotopic solution $\langle w \rangle$ (equation (9)) and the probability
$P(\langle w \rangle)$ are treated as almost constant within the hypercolumn region.
In the above expression (10) we have to be careful about sign changes of the Jacobian
$J(v)$. This is because, in comparison with the previous case, the relation between
$\vec r^*$ and $\vec v(\vec r^*)$ is not one-to-one; the Jacobian can vanish and even
change sign, giving rise to folding maps (see below).

In order to get a picture of the ground state, let us consider the possible situations
that lower the energy (10). First of all, all terms in the energy function are
non-negative. The lowest value of the integrand is attained when the Jacobian factor
$J(v(\vec r))$ vanishes and the locally constant function $w_O(\vec r)$ does not
change its value ($|\partial w_O| \neq \infty$) there. For 2-dimensional mappings the
vanishing Jacobian $J(v(\vec r)) = 0$ occurs in the following cases.

(i) The case rank[J] = 1

(i-a) $J(v)$ changes sign in the neighborhood of $J(v) = 0$.
(i-b) $J(v)$ does not change sign in the neighborhood of $J(v) = 0$.
(ii) The case rank[J] = 0

Here, abusing notation, we write the Jacobian as
$J(v) = \det(\partial v^i(\vec r)/\partial r^j) = \det[J]$. Case (i) corresponds to
line singularities. The equation of the singular line is determined by solving
$J(v) = 0$. In detail, subcase (i-a) indicates that there is a fold along that line,
since a change of sign of the Jacobian implies a reversal of the orientation of the
surface. Subcase (i-b) means the existence of a sort of degeneracy of the mapping
$\vec v(\vec r)$ along the line, but no folds. Simple examples are shown in Figure 2.
The sample mapping $(v^1(\vec r), v^2(\vec r)) = ((r^1)^2, r^2)$ in Figure 2(a)
exhibits a fold along the line $v^1 = r^1 = 0$. Figure 2(b) shows a degenerate line
singularity along $v^1 = r^1 = 0$ in the mapping
$(v^1(\vec r), v^2(\vec r)) = ((r^1)^3, r^2)$.

Figure 2. Examples of a folding map (a) and a line degeneracy (b). The slightly separated
appearance of the leaves in the folding map (a) is for the sake of visibility. These mappings can
occur when the Jacobian $J(v) = \det(\partial v^i(\vec r)/\partial r^j)$ of the mapping
$\vec v(\vec r)$ between the retinal and the corresponding cortical surfaces vanishes.
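
The different kinds of vanishing Jacobian can be told apart numerically. The sketch
below assumes the sample maps $((r^1)^2, r^2)$ for a fold and $((r^1)^3, r^2)$ for a
line degeneracy; the third map, $z \to z^2$ written in real components, is a
hypothetical illustration of a rank-0 point singularity and is not taken from the
text.

```python
import numpy as np

def jacobian_matrix(v, r1, r2, hstep=1e-6):
    """Central-difference Jacobian matrix d v^i / d r^j at (r1, r2)."""
    dv_dr1 = (np.array(v(r1 + hstep, r2)) - np.array(v(r1 - hstep, r2))) / (2 * hstep)
    dv_dr2 = (np.array(v(r1, r2 + hstep)) - np.array(v(r1, r2 - hstep))) / (2 * hstep)
    return np.column_stack([dv_dr1, dv_dr2])

def fold(r1, r2):
    """Case (i-a): J = 2 r1 changes sign across the line r1 = 0."""
    return (r1 ** 2, r2)

def degenerate(r1, r2):
    """Case (i-b): J = 3 r1^2 vanishes on r1 = 0 but keeps its sign."""
    return (r1 ** 3, r2)

def point_singular(r1, r2):
    """Case (ii), illustrative: z -> z^2, Jacobian matrix vanishes at the origin."""
    return (r1 ** 2 - r2 ** 2, 2.0 * r1 * r2)

def det_at(v, r1, r2=0.3):
    """Jacobian determinant J(v) at (r1, r2)."""
    return float(np.linalg.det(jacobian_matrix(v, r1, r2)))
```

Evaluating `det_at` on either side of the line $r^1 = 0$ shows the sign change for the
fold and its absence for the degenerate map, while `np.linalg.matrix_rank` applied to
`jacobian_matrix` on the singular set distinguishes rank 1 from rank 0.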

In our model (10) the type (i-a) singularity, or fold, does not occur unless the
orientation function $w_O(\vec r)$ changes its (locally constant) value when one
crosses the folding line. This is because the "winner-takes-all" rule [UPDT:1]
prohibits multiple winning points
$(\vec w(\vec r^*_i), w_O(\vec r^*_i)) = (\vec w(\vec r^*_j), w_O(\vec r^*_j))$
($\vec r^*_i \neq \vec r^*_j$) that could otherwise occur on leaves $i$, $j$ of the
folded region. (See Figure 3.) To prevent this, each leaf at least has to be organized
to have a different label, $w_O(\vec r^*_i) \neq w_O(\vec r^*_j)$ for $i \neq j$. Then
the contribution to our energy function (10) along these folding lines is ($J = 0$)
times ($|\partial w_O|^2 = \infty$). We encountered a similar situation in the
previous work I, and we can employ a similar regularization method here to make this
product finite. The lower-energy configuration is the one in which the total length of
the folding lines is the shortest allowable. However, this argument is based solely on
the neighborhood of the folding lines. Actually, folding maps are unlikely to occur.
This is because, following the analysis in the previous work I, locally at quite small
scales the lower-energy mapping is almost the identity mapping
$\vec w(\vec r) \simeq \vec r$ up to conformal rescaling. Unless one feeds the system
with non-homogeneous data in such a way that a certain localized region gets mapped
with a twist into the opposite orientation, folding is a quite exceptional situation.
If it happens, the folding line could be observed as something like a "fracture".

Figure 3. Multiple winning points on a folding cover. When one encounters a folding cover of the
retinal surface, because of the "winner-takes-all" rule each leaf $i-1$, $i+1$ of the folding
cover has to have a different orientation preference $w_O(\vec r^*)$.

In contrast, the type (i-b) singularity is rather harmless. It can occur basically
anywhere. In practice, in computer simulations (or in real neuronal circuits) this
type of line singularity, which manifests itself in the continuum theory as a thin
accumulation of points onto a line, will have no observable effect, since the system
is discretized. Unless the lattice spacing (or resolution) is quite small, the effect
will be negligible. If it produces an observable accumulation (or if a certain area
fails to grow and collapses to a line for some reason), then this line would also
resemble a "fracture".

However, if it occurs at the boundary of an iso-orientation region
($|\partial w_O| = \infty$), it contributes to the energy function in the same way as
in the previous case (i-a). In fact, this is precisely the situation at the boundary
of each ocular domain in Figure 2(a) of I. There we described the regularized value of
the product $(J(w) = 0) \cdot (|\partial w_O| = \infty)$ as the kink energy. Its
contribution to the energy (10) is lowered if the total length of the iso-orientation
boundary is the shortest allowable.

Here we should also point out that in the ground state the iso-orientation boundary
$|\partial w_\theta| = \infty$ only occurs at those points where the Jacobian vanishes, $J(v) = 0$.
(Otherwise, the energy (10) diverges and the corresponding probability distribution
vanishes.) Therefore, the iso-orientation boundary typically does not have an
unnecessarily wiggly shape; rather, it is close to geodesic lines.

Finally, we have the case (ii) rank[J] = 0. In this case the condition $J(v) = 0$
determines a point, leading us to a point-singularity or an intersection of line-singularities.
Before discussing the properties of the point-singularity, we briefly consider the
properties of the non-singular region $J(v) \neq 0$. There $w_\theta(\vec r)$ has to be almost constant
($\partial w_\theta \sim 0$) in the ground state. Then the remaining first term in the energy (10) exhibits
the same dynamics as the retinotopy problem analyzed before. Since the probability
$P(\langle w \rangle)$ is constant here, we can reasonably assume that so is $J(v)$, and that $\vec v(\vec r)$ is locally
harmonic, $\partial^2 \vec v = 0$. This could only be violated near the singular lines and points
where $J(v)$ approaches 0.

Generally, using complex coordinates $z = r^1 + i r^2$, $\Phi = v^1 + i v^2$, the harmonic
function here can be written as a linear combination of holomorphic and anti-holomorphic
functions:

$$\Phi = \Phi_0(z) + \Phi_1(\bar z).$$

Since at global scale the mapping is given in terms of a holomorphic (or
anti-holomorphic) function $\Phi$, let us assume for the moment that this property persists
at hypercolumn scale. Namely, we consider it as some sort of analytic continuation
from the global retinotopic function ($\Phi$). This assumption is consistent with the mapping
without folds.

In 2 dimensions the holomorphic mapping is (locally) a conformal mapping. With
an appropriate coordinate transformation, the point singularities (of the type we are
considering) can always be put into a standard form; e.g., near the singular point at
$z_0$ it reads

$$\Phi(\vec v(\vec r)) = (z - z_0)^{b(z_0)}. \qquad (11)$$

Here $b(z_0)$ ($> 1$) is called the ramification index at that point. (If $b(z_0) = 1$, it
represents a regular point, $J(v(z_0)) \neq 0$.) This defines a branched covering of the
retinal surface near the point $v(z_0)$; on the corresponding cortical surface the $b(z_0)$
sheets of the covering come together in a fan shape. For the same reason as in the case
of the folding maps (cf. Figure 3), the "winner-takes-all" rule ensures a different labeling
$w_\theta(z)$ on each covering sheet in the ground state. (Therefore, the ramification index
$b(z_0)$ is smaller than the number of possible orientation specifications $N$, i.e., $b(z_0) < N$.) Thus
there should exist singular lines ($J(v) = 0$) of type (i-b) near this system.

The associated feature-space function $w_\theta(z)$ is determined by minimizing the
energy (10) near that point. Standard variation with respect to $w_\theta(z)$ gives the
equation:

$$\partial_z \partial_{\bar z} w_\theta(z) + J(v)^{-1}\, \partial_z J(v) \cdot \partial_{\bar z} w_\theta(z) + J(v)^{-1}\, \partial_{\bar z} J(v) \cdot \partial_z w_\theta(z) = 0.$$

Substituting (11) into $J(v) = |\partial_z \Phi(z)|^2 - |\partial_{\bar z} \Phi(z)|^2$, we have

$$\partial_z \partial_{\bar z} w_\theta(z) + \frac{b-1}{z}\, \partial_{\bar z} w_\theta(z) + \frac{b-1}{\bar z}\, \partial_z w_\theta(z) = 0, \qquad (12)$$
where we put $z_0 = 0$, $b(z_0) = b$ for simplicity. Since $w_\theta(z)$ is a locally constant
function, this is satisfied almost everywhere except near the singular points
$J(v) = 0$. (See Figure 4.) As stated above, each covering-sheet has a different label
$w_\theta = e^{i\pi k_i/N}$, $k_i \neq k_j$. Since the distance between two points in the feature space
is $|e^{i\pi k_i/N} - e^{i\pi k_j/N}| = 2\sin[\pi(k_i - k_j)/2N]$, the neighbouring labels $k_i$ and $k_{i\pm1}$ are
expected to take close values under the [UPDT:2]. In the case $N \to \infty$ and the
ramification index $b \to \infty$ (recall $N > b$), where each covering sheet looks like a thin
fan-shaped region on the cortical surface, each $k_i$ approaches a continuous function of
the index $i$, or equivalently a function of $\arg[z]$.

Figure 4. Labeling of a multiple covering (unfolded view). As in the case of folding covers (Figure 3),
the "winner-takes-all" rule enforces a different labeling $w_\theta = e^{i\pi k_i/N}$, $k_i \neq k_j$, for each covering-
sheet of the retinal surface in general multiple coverings.
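The feature-space distance quoted above can be checked numerically. The following is a minimal sketch, assuming the labels are the unit-circle points $e^{i\pi k/N}$; the helper names `label` and `label_distance` and the value $N = 16$ are illustrative choices, not taken from the paper.

```python
import cmath
import math


def label(k: int, n_modules: int) -> complex:
    """Feature-space label w_theta = exp(i*pi*k/N) of orientation module k."""
    return cmath.exp(1j * math.pi * k / n_modules)


def label_distance(ki: int, kj: int, n_modules: int) -> float:
    """Chord length between the labels of modules ki and kj on the unit circle."""
    return abs(label(ki, n_modules) - label(kj, n_modules))


N = 16  # hypothetical number of orientation modules
for ki, kj in [(3, 2), (7, 2), (12, 4)]:
    direct = label_distance(ki, kj, N)
    closed_form = 2 * math.sin(math.pi * (ki - kj) / (2 * N))
    # Both columns agree; the distance grows with the label separation ki - kj.
    print(ki, kj, round(direct, 6), round(closed_form, 6))
```

Because the distance increases monotonically with $|k_i - k_j|$ (for separations below $N$), neighbouring sheets with close labels are the cheapest arrangement in feature space.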

In fact, one can solve equation (12) generally as follows. Since equation (12) can
be rewritten as

$$\partial_z \partial_{\bar z} \left( z^{b-1} \bar z^{b-1} w_\theta(z) \right) = (b-1)^2 \left( z^{b-2} \bar z^{b-2} w_\theta(z) \right)$$

with $\partial_z \partial_{\bar z}$ being simply a 2-dimensional Laplacian, we can write down its general
solution as a linear combination of products of azimuthal functions $e^{2i\gamma\phi} = (z/\bar z)^\gamma$
and radial ones $f_\gamma(\rho)$ (where $\rho^2 = z\bar z$) for various spectra $\gamma$. Explicitly we have

$$w_\theta(z) = \int_{-\infty}^{\infty} d\gamma\, A(\gamma) \left( \frac{z}{\bar z} \right)^{\gamma} (z\bar z)^{-(b-1) + \sqrt{(b-1)^2 + \gamma^2}}, \qquad (13)$$

where $A(\gamma)$ is an unknown coefficient function (that would be determined by solving
the global system including ocular terms in the energy function). In this equation we
have discarded the other possible term $(z/\bar z)^\gamma (z\bar z)^{-(b-1) - \sqrt{(b-1)^2 + \gamma^2}}$, which is divergent
as $z \to 0$, since $|w_\theta(z)| \le 1$ by definition. Among the expansion terms in equation
(13), only one term is dominant at $z \approx 0$; it is given by

$$w_\theta(z) \approx A(\gamma_{|\min|}) \left( \frac{z}{\bar z} \right)^{\gamma_{|\min|}} \cdot (z\bar z)^{\gamma_{|\min|}^2 / 2b}. \qquad (14)$$
Here $\gamma_{|\min|}$ is the minimum positive or maximum negative value of $\gamma$ (whichever has the
smaller absolute value) for which the expansion coefficient function $A(\gamma)$ does not
vanish. We also took the leading term in the $b \to \infty$ limit. Depending on the sign
of $\gamma_{|\min|}$, $w_\theta(z)$ gives a clock-wise or anti-clock-wise configuration of the orientation
detector cells. [If $A(0) \neq 0$, then $\gamma_{|\min|} = 0$, i.e., the feature function $w_\theta(z)$ is
constant. Following the preceding argument below equation (12), however, this is not the
case.] Therefore, we conclude generally that in association with singular points
$\Phi \sim z^b$ we have clock-wise or anti-clock-wise orientation wheels, and that each
orientation module organized around the singular point is one sheet of the $b(z_0)$-tuple covering
of the local area of the retina. Since $w_\theta(z)$ in equation (14) is the continuum limit
$N \to \infty$ of $w_\theta = e^{i\pi k_i/N}$, $\gamma_{|\min|}$ has to satisfy $-1/4 < \gamma_{|\min|} < 1/4$.
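As a numerical sanity check of the separated solution, the sketch below assumes that equation (12) has the form $\partial_z\partial_{\bar z} w + \frac{b-1}{z}\partial_{\bar z} w + \frac{b-1}{\bar z}\partial_z w = 0$ and that a single mode is $(z/\bar z)^\gamma (z\bar z)^\alpha$ with $\alpha = -(b-1) + \sqrt{(b-1)^2 + \gamma^2}$. Both forms are readings of the text, and the function names, the sample point and the finite-difference step are illustrative assumptions.

```python
import cmath
import math


def mode(z: complex, b: float, gamma: float) -> complex:
    """Single mode (z/zbar)^gamma * (z*zbar)^alpha with the non-divergent exponent."""
    alpha = -(b - 1) + math.sqrt((b - 1) ** 2 + gamma ** 2)
    return cmath.exp(2j * gamma * cmath.phase(z)) * abs(z) ** (2 * alpha)


def residual(z: complex, b: float, gamma: float, h: float = 1e-4) -> complex:
    """Left-hand side of the assumed equation (12), by central finite differences."""
    def f(dx: float, dy: float) -> complex:
        return mode(z + complex(dx, dy), b, gamma)

    fx = (f(h, 0) - f(-h, 0)) / (2 * h)
    fy = (f(0, h) - f(0, -h)) / (2 * h)
    laplacian = (f(h, 0) + f(-h, 0) + f(0, h) + f(0, -h) - 4 * f(0, 0)) / h ** 2
    d_z = 0.5 * (fx - 1j * fy)      # Wirtinger derivative with respect to z
    d_zbar = 0.5 * (fx + 1j * fy)   # Wirtinger derivative with respect to zbar
    # d_z d_zbar equals one quarter of the 2-dimensional Laplacian.
    return laplacian / 4 + (b - 1) / z * d_zbar + (b - 1) / z.conjugate() * d_z


z_sample = 0.7 + 0.4j  # a point away from the branch cut of the phase
print(abs(residual(z_sample, b=3.0, gamma=0.2)))

# For large b the radial exponent approaches gamma^2 / (2 (b - 1)).
b_large, gamma = 200.0, 0.2
alpha = -(b_large - 1) + math.sqrt((b_large - 1) ** 2 + gamma ** 2)
print(alpha, gamma ** 2 / (2 * (b_large - 1)))
```

The residual is zero up to discretization error, and the second printout illustrates why the radial factor becomes nearly flat in the $b \to \infty$ limit.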

What is so remarkable about working in a (piecewise) holomorphic environment
is that there exists a celebrated topological formula by Riemann and Hurwitz that
relates the ramification indices $b(z)$ to the Euler characteristics of the surfaces under
consideration. Specifically, let $\chi(E)$ and $\chi(F)$ be the Euler characteristics of the
surfaces $E$ and $F$, and let $b(z)$ be the ramification index at $z \in F$ associated with the
covering $\Phi(z): F \to E$; then it states the relation

$$\chi(F) = n \cdot \chi(E) - \sum_{z \in F} (b(z) - 1). \qquad (15)$$

In this equation $n$ is the mapping degree (or the total number of coverings) of the
map $\Phi(z)$, i.e.,

$$n = \sum_{z \in \Phi^{-1}(w_0)} b(z)$$

for an (arbitrary) point $w_0$ in $E$. (For an accessible exposition of this formula, see,
e.g., Griffiths, P. and Harris, J., Principles of Algebraic Geometry, John Wiley & Sons,
pp. 216-218 (1978).) In the case at hand, $E$ and $F$ are small square regions of the retina
and the visual cortex, respectively, and we can take $\chi(E) = 1$ and $\chi(F) =$ number of
blocks of continuously changing orientation area (see next section). Plainly speaking,
the mapping degree here is just the total number of orientation modules within one
hypercolumn. Since the ramification index equals one at almost all points of the
cortical surface $F$ (i.e., ordinary points), the summation $\sum_{z \in F} (b(z) - 1)$ in equation
(15) is actually a finite sum over the singular points at which orientation modules come
together. The most important thing that equation (15) tells us is the "existence"
of singularities: since the Euler characteristics are as stated above, as long
as the number of orientation modules is larger than the number of blocks of continuously
changing orientation area within one hypercolumn ($n > \chi(F)$), there must exist
points with non-trivial ramification index $b(z) > 1$. Following the preceding analysis
of the associated feature-space function $w_\theta(z)$, this implies the existence of the
orientation wheels.
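The existence argument is simple arithmetic on the Riemann-Hurwitz relation $\chi(F) = n\,\chi(E) - \sum (b(z) - 1)$. The sketch below illustrates it for a plaquette with $\chi(E) = 1$; the function names and the example value $N = 5$ are illustrative assumptions.

```python
def euler_characteristic_of_cover(degree, chi_base, ramification_indices):
    """chi(F) = n * chi(E) - sum over singular points of (b - 1)."""
    return degree * chi_base - sum(b - 1 for b in ramification_indices)


def total_branching(degree, chi_base, chi_cover):
    """Total branching number sum(b - 1) required by given chi(E), chi(F), n."""
    return degree * chi_base - chi_cover


n_modules = 5  # hypothetical number of orientation modules per hypercolumn
for chi_cover in range(1, n_modules + 1):
    needed = total_branching(n_modules, 1, chi_cover)
    # needed > 0 whenever n > chi(F): some point must have b > 1.
    print(chi_cover, needed)

# Two ways to realize chi(F) = 1 with n = 5: one index-5 point,
# or four index-2 (transpositional) points.
print(euler_characteristic_of_cover(5, 1, [5]))
print(euler_characteristic_of_cover(5, 1, [2, 2, 2, 2]))
```

Only the unbranched case $\chi(F) = n$ needs no singular point; every other value of $\chi(F)$ forces a positive total branching number.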

Unfortunately, the global configuration of these singular points is beyond the scope
of this formula. However, we can consider the statistical distribution function (a partition
function) of possible mixed branched and unbranched coverings of retinal surfaces,
and write it down as a function of the number of singular points. We will do this in
the next section.

To summarize, we have discussed various possible situations that lower the energy
(10). The semi-global structure at the scale of the hypercolumn size can be sketched as follows.
Although there is a difference between the 2-dimensional Gaussian factor $\Lambda_{\vec r \vec r'}$ here and the
1-dimensional one used in the previous calculation in I, the basic organization structure in
an arbitrary one-dimensional cross-section of the present case will bear resemblance to
the kink-anti-kink configuration (Figure 2(a) of I). Namely, dominance by only one
feature (i.e., $w_\theta(z) =$ constant) over a broad range of the cortical surface is unlikely
to occur. This includes the case $w_\theta(z) = 0$, where there is no orientation preference,
since these configurations merely raise the total energy for the same reason as in the
1-dimensional case, i.e., the non-trivial contribution from the second term in the energy (10)
gives a large value proportional to the area of such regions. Instead, the lower energy
configuration is such that the various locally constant values $w_\theta(z) = e^{i\pi k_j/N}$ are
distributed over the surface in such a way that their boundaries (where $|\partial w_\theta| = \infty$)
become as short as allowed and consistent with the distributed orientation wheels.
The reason for this is that the boundaries contribute finite positive energy after the
regularization of the product with the Jacobian, $(J(v) = 0) \cdot (|\partial w_\theta|^2 = \infty)$, which is the
same situation as in I, where the corresponding product was referred to as kink energy.

We also discussed the structure of point singularities in detail, and how orienta- 
tion wheels (both clock-wise and anti-clock-wise) occur in association with the point 
singularities of the Jacobian J(v) = 0. The monotonic (increasing or decreasing) 
change of the orientation preference angle was also shown to take place as locally 
energy-minimizing configurations. Most remarkable is the number of singularities 
that was found to follow the celebrated Riemann-Hurwitz formula. 
4 Partition function

In this section we would like to estimate the statistical distribution function of our model
associated with the lower energy configurations described in the previous section. This
corresponds to the evaluation of the path integral (see, e.g., our previous work I)

$$Z \sim \int \mathcal{D}v \; e^{-\Delta E/\lambda} \sim \sum_{v} [\text{multiplicity of the covering map } v] \; e^{-\Delta E/\lambda}$$

for the dominant contributions. As we discussed before, the covering map here is
specified by its mapping degree $n = N$, the total number of branch points $I$, their
positions $z_i$, and the associated ramification indices $b(z_i)$. Most of this section is devoted
to the counting problem of the multiplicities of these mappings for the given data
$(N, I, b(z_i))$.



4.1 Unbranched and branched coverings of retinal surface 

Although our main concern in this paper is the covering maps of an area of the retina,
which is topologically equivalent to a disk, we first start with an unbranched covering
of a torus for illustrative purposes. We write a plaquette for a torus, assuming
identification (Figure 5(a)) of like-labeled edges on each side. An unbranched
$N$-sheeted covering is a collection of $N$ such objects glued together (Figure 5(b)) along
like-labeled edges. There are many ways to construct this type of covering surface
(including ones with disconnected components). In order to cover the original torus,
one identifies the horizontal edges, and then wraps up the torus with the resulting
cylinder by putting one end of the cylinder into the other end until the circumferences
match up with the original torus (Figure 5(c)). In this way, every point of the original
torus is covered precisely $N$ times. The Riemann-Hurwitz formula (15) is trivially
satisfied.

Figure 5. An unbranched covering of a torus. (a) We write a plaquette for a torus, assuming
identification of like-labeled edges on each side. (b) An unbranched $N$-sheeted covering
is a collection of $N$ such objects glued together along like-labeled edges. (c) One then
identifies the horizontal edges, and wraps up the torus with the resulting cylinder by putting
one end of the cylinder into the other end until the circumferences match up with the original
torus.

Next we consider a branched covering of a sphere ($S^2$). We start with a
2-sheeted unbranched covering of $S^2$. Just as in the previous case, the unbranched covering
corresponds to a simple double wrapping of $S^2$; every point is covered twice in this
covering. We name the covering sheets "1" and "2". To make a branched
covering from this, we pick two points ($A$, $B$) on the sphere and cut the covering sheets along
the straight line connecting these two points (Figure 6(a)). Then we identify one
edge of sheet "1" with the edge of sheet "2" on the other side of the cut, and
the remaining edge of sheet "1" with that of sheet "2", i.e., we identify
the like-labeled edges $p$, $q$ in Figure 6(a). The resulting covering is branched; the two
points $A$, $B$ we picked are branch points (of index 2).

To see its topologically equivalent surface, we peel off the outer sphere, and paste
it back along the edges $p$, $q$ as prescribed (Figure 6(b)). Then we end up with an object
of sphere topology. The Riemann-Hurwitz formula (15) is satisfied, since the Euler
characteristics are $\chi(E) = \chi(F) = 2$ for the sphere, $n = 2$ (double covering), and the ramification
indices are $b = 2$ at both branch points ($A$, $B$). [In the case of a simple unbranched covering,
the Euler characteristic of the covering surface is just that of the original one multiplied
by the number of coverings.]
Figure 6. A branched covering of a sphere. We start with a 2-sheeted unbranched covering of $S^2$,
and name the covering sheets "1" and "2". To make a branched covering from this, we pick
two points ($A$, $B$) on the sphere and cut the covering sheets along the straight line connecting
these two points (Figure (a)). Then we identify one edge of sheet "1" with the edge of
sheet "2" on the other side of the cut, and the remaining edge of sheet "1" with that
of sheet "2", i.e., we identify the like-labeled edges $p$, $q$ in Figure (a). The resulting
covering is branched; the two points $A$, $B$ we picked are branch points (of index 2). To see its
topologically equivalent surface, we peel off the outer sphere, and paste it back along the edges
$p$, $q$ as prescribed (Figure (b)).

An example of the corresponding analytic expression for this covering is the
mapping $z \mapsto w = z^2$ ($\in \mathbb{C} \cup \{\infty\}$), where the singularities (of ramification index
2) occur at the north ($z = \infty$) and south ($z = 0$) poles of the (stereographically
projected) sphere.

The branch points on this sphere are characterized as follows. We draw a small
circle on the sphere surrounding one of the branch points. Obviously, going around
the circle once takes one from the covering-sheet "1" (or "2") to "2" (or "1"), and
a repetition of this process brings one back to the starting position. The same is true
of the other branch point. One can think of this process as an action of the
transposition (12) in the symmetric group $S_2$ on the labels "1, 2" of the
covering-sheets.

For the case of a 3-sheeted branched covering with two branch points, we can repeat
a similar construction, using the identification rules shown in Figure 7(a). The
figure shows the sectional view from the north pole. It is easy to see that the
resulting branched cover again has sphere topology. An example of the corresponding
analytic mapping is $z \mapsto w = z^3$ with the ramification indices $b(0) = b(\infty) = 3$. In
this case, the branch points are characterized by the permutation element (123) of the
symmetric group $S_3$. Figure 7(b) shows another gluing rule in which mixed branched
and unbranched coverings occur. This time the branch points are characterized by the
transposition (12)(3) of $S_3$. In Figure 7(c), in another construction, we separated
the two branch points with ramification index 3 in Figure 7(a) into four branch
points with ramification index 2. [The total branching number is invariant in this
process: $(3 - 1) \times 2 = (2 - 1) \times 4$.] The new branch points are characterized by
(12)(3) and (1)(23) of the symmetric group $S_3$, respectively, associated with each gluing
edge.




Figure 7. 3-sheeted branched coverings of a sphere. For the case of a 3-sheeted branched covering
with two branch points, we can repeat our construction, using the identification rules shown in
Figure (a). The figure shows the sectional view from the north pole. Figure (b) shows another
gluing rule in which mixed branched and unbranched coverings occur. In Figure (c), in another
construction, we separated the two branch points with ramification index 3 in Figure (a)
into four branch points with ramification index 2.
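The gluing rules of the 2- and 3-sheeted examples can be mimicked with explicit permutations of the sheet labels. The sketch below is only an illustration: it assumes each branch point contributes (number of sheets minus number of cycles of its permutation) to the sum in the Riemann-Hurwitz formula, and the helper names are hypothetical.

```python
def from_cycles(n, cycles):
    """Permutation of sheets 1..n as a tuple of images (index 0 is unused)."""
    perm = list(range(n + 1))
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a] = b
    return tuple(perm)


def compose(p, q):
    """Product p*q, meaning: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))


def branching_number(p):
    """Sum of (cycle length - 1) over the cycles of p."""
    n = len(p) - 1
    seen = set()
    cycles = 0
    for start in range(1, n + 1):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = p[j]
    return n - cycles


def chi_cover(n, chi_base, perms):
    """Euler characteristic of the cover from the Riemann-Hurwitz relation."""
    return n * chi_base - sum(branching_number(p) for p in perms)


identity = from_cycles(3, [])
c3 = from_cycles(3, [(1, 2, 3)])
c3_inv = from_cycles(3, [(3, 2, 1)])
t12 = from_cycles(3, [(1, 2)])
t23 = from_cycles(3, [(2, 3)])

print(compose(c3, c3_inv) == identity)      # two index-3 points close up on the sphere
print(compose(t12, t23) == c3)              # (123) splits into two transpositions
print(chi_cover(3, 2, [c3, c3_inv]))        # Figure 7(a)-like rule: a sphere
print(chi_cover(3, 2, [t12, t12]))          # Figure 7(b)-like rule: two spheres
print(chi_cover(3, 2, [t12, t23, t23, t12]))  # Figure 7(c)-like rule: a sphere
```

The mixed case gives $\chi = 4$, i.e., a doubly covered sphere plus one detached unbranched sheet, while splitting each 3-cycle into two transpositions leaves the total branching number, and hence $\chi$, unchanged.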

One can generalize this construction, and create $N$-sheeted branched (and
unbranched) coverings with $I$ branch points by associating permutation elements $\pi_i$,
$i = 1, \ldots, I$, in the symmetric group $S_N$ to each branch point $z_i$ on the sphere. The choice
of the permutations $\pi_i$ is basically arbitrary except for one constraint to be satisfied by
the $\pi_i$'s:

$$\pi_1 \pi_2 \cdots \pi_I = 1. \qquad (16)$$

The reason for this is the following (e.g., Hurwitz 1891). We draw small circles $c_i$
around every branch point $z_i$, and go around them once starting from an arbitrary
non-branch point $P$ on the sphere (Figure 8(a)). This gives the left-hand side of
the constraint equation (16). Since we are on the sphere, we can deform this loop
(without crossing any branch points) into another loop with the opposite orientation
that surrounds a small region near the starting point $P$ (Figure 8(b)). Since the point
$P$ is a regular point, this gives the right-hand side of equation (16). One obtains
different coverings according to the various combinations of $\pi_i$'s.

The number of independent coverings created in this way can be counted as

$$\sum_{\pi_i \in S_N}{}^{\prime} \; \delta(\pi_1 \pi_2 \cdots \pi_I),$$

where $\sum'$ stands for the sum avoiding double counting, and $\delta(p)$ is Kronecker's delta

$$\delta(p) = \begin{cases} 1 & \text{if } p = 1 \\ 0 & \text{otherwise.} \end{cases}$$

We will specify the counting rule $\sum'$ in our specific problem later.

Figure 8. Geometric meaning of the constraint equation (16). We draw small circles $c_i$ around
every branch point $z_i$, and go around them once starting from an arbitrary non-branch point $P$
on the sphere (Figure (a)). This gives the left-hand side of the constraint equation (16). Since
we are on the sphere, we can deform this loop (without crossing any branch points) into another
loop with the opposite orientation that surrounds a small region near the starting point $P$ (Figure
(b)). Since the point $P$ is a regular point, this gives the right-hand side of equation (16).

Now we would like to consider a similar construction for coverings of a plaquette,
which is what is needed in this paper. In this case we have to take into account the coverings of
the boundary. We assume that our covering sheet(s) have $m$ ($\le N$) non-intersecting,
mutually independent boundaries $B_k$ ($k = 1, \ldots, m$), each homeomorphic to a circle
$S^1$, and that the boundary of the original plaquette is covered by these boundaries
$B_k$, each $L_k$ times, for some non-negative integer $L_k$ ($L_1 + L_2 + \cdots + L_m = N$).
[Namely, $B_k$ is the $L_k$-fold cover of the boundary of the original plaquette.]
In a similar way as we specified branch points, we can specify this boundary
covering in terms of an element $\sigma \in S_N$ that consists of $m$ cycles of lengths
$L_1, L_2, \ldots, L_m$, i.e., $\sigma = (\ell_1 \ell_2 \cdots \ell_{L_1})(m_1 m_2 \cdots m_{L_2}) \cdots (q_1 q_2 \cdots q_{L_m})$. If one goes
around the boundary of the original plaquette once in the anti-clock-wise direction, one
moves on the boundary circles in the covering space in the way specified by $\sigma$.

With these preparations we can now construct a branched covering of a plaquette
in exactly the same way as in the sphere case. However, here the constraint (16) is
replaced by the following one:

$$\pi_1 \pi_2 \cdots \pi_I = \sigma^{-1}. \qquad (17)$$

The reason is obvious by construction; we just repeat the discussion following equation
(16) with the sphere replaced by a plaquette.

We get various mixed branched (and unbranched) coverings under different
combinations of $\pi_i$'s and $\sigma$ in the symmetric group $S_N$.

In order for this construction to be applicable to our specific problem of orientation
modules, we must restrict the possible forms of $\pi_i$'s and $\sigma$. Recall that every orientation
module is labeled by the locally constant function $w_\theta = \exp(i\pi k_i/N)$, $k_i = 1, \ldots, N$,
and that they constitute an $N$-sheeted covering of an area of the retinal surface.
Since, as discussed in the previous section, the neighbouring modules have similar
orientation specifications $k_i$, first we must have the boundary $\sigma$ as in

$$\sigma = (1\,2 \cdots L_1)^{\epsilon_1} (L_1 + 1, \cdots, L_1 + L_2)^{\epsilon_2} \cdots (N - L_m + 1, \cdots, N)^{\epsilon_m}, \qquad (18)$$

where $\epsilon_i = \pm 1$ ($i = 1, 2, \ldots, m$). Secondly, in order to prevent double counting
we restrict all $\pi_i$ to be transpositions $(k\,l) \in S_N$. This is because, as we saw in
Figure 7, a non-transpositional gluing rule can always be considered as a special case
of transpositional gluings in which some of the branch points of ramification index
2 accidentally come close to each other. Furthermore, in this case we must restrict them to the
generators of the symmetric group, $(k\ k{+}1)$, $k = 1, 2, \ldots, N - 1$, and $(N\ 1)$. The reason
for this is that locally the neighbouring modules have similar orientation specifications
$k_{i \pm 1} = k_i \pm 1$.

Putting all these together, we end up with the expression for the partition function
of our orientation modules with module number $N$ per hypercolumn:

(19)

Here the summation $\sum'_{\sigma, \pi_i \in S_N}$ implies that we restrict $\pi_i$ to the generators $(k\ k{+}1)$,
$k = 1, 2, \cdots, N - 1$, and $(N\ 1)$ only, and $\sigma$ as in equation (18), while avoiding any double
counting. The energy $\Delta E[v]$ is a function of the singular points $z_i$ and the boundary
condition specified by $\sigma$.*

4.2 Most probable modules 

Although we are interested in the case where the total module number $N$ (per
hypercolumn) is large, let us first consider, as an example, the case $N = 5$. First of all, since
the cortical surface $F$ has only simple topology, as a covering space its Euler
characteristic $\chi(F)$ has to be a positive integer no larger than $N$: $\chi(F) = 1, 2, 3, \ldots, N$.
The case $\chi(F) = N$ corresponds to $N$ unbranched (disconnected) coverings, and
$\chi(F) < N$ if the covering is a mixed branched and unbranched one. From the
Riemann-Hurwitz formula (15) this restricts the number of (transpositional)
singularities to $I = 0, 1, 2, \ldots, N - 1$. In the case $N = 5$ this implies $I = 0, 1, \ldots, 4$, and
we can easily evaluate the possible number of mappings $\sum' \delta(\pi_1 \pi_2 \cdots \pi_I \sigma)$ (Table 1). For
the boundary condition $\sigma = 1$ we get $\sum' \delta(\pi_1 \pi_2) = 5$, $\sum' \delta(\pi_1 \pi_2 \pi_3 \pi_4) = 15$, etc.
Intuitively these surfaces correspond to coverings connected with double arcs (see
Figure 9(b)). In Figure 9(a) we start with ($N =$) 5 unbranched covering-sheets
of the original plaquette. In Figure 9(b), we connected the sheets "1" and "2" along
a line connecting two singularities $A$, $B$, via $\pi_A = \pi_B = (12)$, using a rule similar to the one
explained in Figure 6 in the sphere case. The line connecting the two surfaces is called
a "double arc". Every term of the type $(i, i+1) \cdots (i, i+1)$ appearing in $\pi_1 \pi_2 \cdots \pi_I$
represents such a double arc connecting a pair of covering-sheets "$i$" and "$i+1$".

Figure 9. Coverings of a plaquette. In Figure (a) we start with $N$ unbranched covering-sheets
of the original plaquette. In Figure (b), we connected the covering-sheets "1" and "2" along a
line connecting two singularities $A$, $B$, via $\pi_A = \pi_B = (12)$, using a rule similar to the one explained in
Figure 6 in the sphere case. The line connecting the two sheets is called a "double arc". For the
case with the non-trivial boundary condition $\sigma = \sigma(j) = (j, j+1)$, the intuitive picture of the
covering is just like the double arcs stated above, except for one singular point that is anchored
to $\sigma = (j, j+1)$ on the boundary (Figure (c)). The latter comprises a branch cut along the line
connecting them.

*The quantity $\Delta[v]$ is a Jacobian arising from the change of functional integration variable $\mathcal{D}v \to d^2 z$
in the collective coordinate method.
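The counts quoted for the trivial boundary condition $\sigma = 1$ can be reproduced by brute force under one plausible reading of the primed sum: the $\pi_i$ range over the cyclically adjacent transpositions $(k\ k{+}1)$ and $(N\ 1)$, and two words that differ only in the order of their factors are counted once. That reading, and the function names below, are assumptions made for illustration.

```python
from itertools import product


def generators(n):
    """Cyclically adjacent transpositions on sheets 0..n-1 (the last one wraps around)."""
    gens = []
    for k in range(n):
        perm = list(range(n))
        a, b = k, (k + 1) % n
        perm[a], perm[b] = b, a
        gens.append(tuple(perm))
    return gens


def compose(p, q):
    """Product p*q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))


def count_coverings(n, num_singularities):
    """Number of unordered generator multisets admitting a word with product 1."""
    gens = generators(n)
    identity = tuple(range(n))
    classes = set()
    for word in product(range(len(gens)), repeat=num_singularities):
        perm = identity
        for g in word:
            perm = compose(perm, gens[g])
        if perm == identity:
            classes.add(tuple(sorted(word)))  # forget the ordering of the factors
    return len(classes)


for num in range(5):
    print(num, count_coverings(5, num))
```

With $N = 5$ this prints 1, 0, 5, 0, 15 for $I = 0, \ldots, 4$: one double arc can join any of the 5 adjacent sheet pairs, and two double arcs can be chosen in 15 ways when repetition is allowed.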

For the case of the boundary condition $\sigma = \sigma(j) = (j, j+1)$, we get $\sum_j \sum' \delta(\pi_1\, \sigma(j)) = 5$,
$\sum_j \sum' \delta(\pi_1 \pi_2 \pi_3\, \sigma(j)) = 15$, etc. In this case the intuitive picture is just like the
double arcs stated above, except for one singular point that is anchored to $\sigma = (j, j+1)$
on the boundary (Figure 9(c)). The latter comprises a branch cut along the line
connecting them. The pairs of transpositions that do not contract with $\sigma$ compose the
double arcs as before.

In general we can locate $\sigma$ anywhere on the boundary. As in the case of the
singularities $\pi_i$, we consider the component decomposition of $\sigma$ in terms of transpositional
elements $(j, j+1)$. We assume that these components of $\sigma$ are distributed separately
over the boundary in the same order as they appear in the transpositional component
decomposition of minimal length. The summation $\sum_{\sigma \in S_N}$ thus gives rise to factors
of $\zeta$ (= length of the boundary in units of $a_0$ (= hypercolumn size)) in association
with each transpositional component $(j, j+1)$.

Table 1. Combinatorial factors for composing an ($N =$) 5-fold covering of a small segment of the retinal
surface corresponding to one hypercolumn area on the visual cortex. $g = \zeta A_0 e^{-\Delta E_0/\lambda}$.

I                                                   0     1      2         3           4

Σ' δ(π_1 ··· π_I)                                   1            5Λ^2                  15Λ^4
Σ_j Σ' δ(π_1 ··· π_I (j, j+1))                            5g               15gΛ^2
Σ_i Σ' δ(π_1 ··· π_I (i, i+1, i+2))                              10g^2                 20g^2Λ^2
Σ_i Σ' δ(π_1 ··· π_I (i, i+1, i+2, i+3))
Σ_cycles Σ' δ(π_1 ··· π_I (12345))                                                     10g^4
Σ_pairs Σ' δ(π_1 ··· π_I (12)(34))                                                     10g^2Λ^2
Σ_pairs Σ' δ(π_1 ··· π_I (123)(45))                                        10g^3




In the path integral ([[£]) the integration $\int \prod_{i=1}^{I} d^2 z_i$ also gives rise to factors
of $A_0$ (= area of the covered plaquette in units of $a_0^2$) in association with the
positional degrees of freedom of the (transpositional) singularities. [For the complete
handling of these factors (proportional to the boundary length and area), one actually
has to solve the stationary point equation associated with (f|), and to integrate over all
the positions of the singularities and the boundary conditions; the integrand $e^{-\Delta S}$
depends on those (so-called moduli) parameters in general. The actual integration
over the moduli generally gives slightly different contributions from the simple
multiplication of factors of the area and the boundary length of the plaquette as
above. However, in order to get a rough estimate of the path integral, here we ignore
the dependence of the integrand $e^{-\Delta S}$ ($\Delta S = \Delta E/\lambda$, cf. equation (|H])) on those parameters,
and we simply multiply the factors of $A_0$ and $\zeta$ by its representative value $e^{-\Delta E_0/\lambda}$
per (transpositional) singularity $(i, i+1)$.] We combine the two factors $A_0$ and $e^{-\Delta E_0/\lambda}$,
and write $\Lambda = A_0 e^{-\Delta E_0/\lambda}$. In Table 1, instead of the variables $\zeta$, $\Lambda$ we have used
$g$ ($= \zeta\Lambda$), $\Lambda$. In the latter parametrization the power of $g$ equals the (minimal) transpositional
component number of the boundary condition $\sigma$, and that of $\Lambda$ represents twice the
number of double arcs in the resulting coverings.

From Table 1 and the definition $\Lambda = A_0 e^{-\Delta E_0/\lambda}$, one can immediately
conclude that an appropriate choice of the lateral correlation strength $\lambda$ in our model
suppresses the double-arc contributions (i.e., $\Lambda < 1$). We assume such a choice of $\lambda$ in
the following, since it is close to the real situation. Then, among the resulting
$\Lambda$-independent terms, the maximum contribution comes from the term with the
boundary condition of $\sigma = (12345)$-type if $g > 1$, and from other $\sigma$'s if $g < 1$. Since
Ao is of order 1 and ( ~ 24y / 3,[] (unless the measure A for the lateral correlation 
strength is too small) the former boundary condition, $\sigma = (12345)$-type, is realized.
Hence one gets one orientation-wheel with all orientation directions organized into it.
(We assume that on the global scale all transpositional components (separated as
a convention for counting the multiplicities) are organized in such a way that each
boundary condition $\sigma$ (with $m$ cycles) corresponds to $m$ global orientation-wheel(s).)

The next largest contribution is from those terms with $\sigma$ of (1234)-type or
(123)(45)-type; namely, the singularities compose either one wheel with fewer orientation
modules, or two orientation wheels with three and two orientation specifications as
in (123) and (45). (When $g$ becomes smaller than 1, the boundary conditions other
than the $\sigma = (12345)$-type give the dominant contributions. For the detailed classification of
this case we would need a more precise evaluation of each term in Table 1.)

†The ocular-dominance width ~ from our previous work (I) has been used.



Table 2. Approximate combinatorial factors for the case that the covering multiplicity $N$ of a
retinal surface is large. ($L = \ell_1 + \ell_2 + \cdots + \ell_m - m$)

σ                                  combinatorial factor

1                                  1
(i, i+1)                           N g
(i, i+1, i+2)                      2N g^2
(i, i+1, i+2, i+3)                 2N g^3
(123 ··· N)                        2N g^(N-1)
(12)(34)                           N^2 g^2 / 2
(12)(34)(56)                       N^3 g^3 / 3!
{ℓ_1, ℓ_2, ..., ℓ_m; n}            Q_σ of equation (20)

Now we turn to the case of large $N$ (Table 2)‡. The basic structure of the combinatorial
factors remains the same as in the $N = 5$ case. Since we are not interested in the coverings
with multiple double-arcs, we omit the terms with explicit $\Lambda$ dependence in Table 2.
For this purpose we have assumed an appropriate choice of the strength measure $\lambda$ of
the lateral correlation $\Lambda_{\vec r \vec r'}$ (so that $\Lambda = A_0 e^{-\Delta E_0/\lambda} < 1$). We also write the boundary
condition $\sigma$ as $\{\ell_1, \ell_2, \ldots, \ell_m; n\}$, which consists of cycles of lengths $\ell_1, \ell_2, \ldots, \ell_m$ among
which $n$ cycles are non-transpositional. The approximate combinatorial factor $Q_\sigma$ for the
general $\sigma = \{\ell_1, \ell_2, \ldots, \ell_m; n\}$ is given as

$$Q_\sigma \approx 2^n\, C_\sigma\, N g^L \prod_{k=1}^{m-1} \left( 1 + \frac{N - L - m + 1}{k} \right), \qquad (20)$$

where $C_\sigma$ is a symmetry factor, e.g., $C_\sigma = 1/m$ if all the cycle lengths $\ell_i$ are equal,
$C_\sigma = 1/3$ if $\sigma = \{\ell_1, \ell_2, \ell_1, \ell_2, \ell_1, \ell_2; n\}$, etc., and $C_\sigma = 1$ if there is no symmetry in
the configuration of the $\ell_i$'s, and $L = \sum_{i=1}^{m} (\ell_i - 1)$.

A larger combinatorial factor is obtained if n is large (= m) and a is non- 
symmetric (so that C a = 1). Restricting a to such cases, we evaluate the factor 



■fin the table we write only representative boundary conditions a in each row; the summation 
with respect to i or possible er's that fall into the same class as the representative a is assumed. 
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Q a in the decreasing order of the power g . In general there are several configu- 
rations <7 that have the same power g L . For instance, in the case L = N — 4 we 
have a = {£ u £ 2 , £ 3 , N - £f =1 £ t ; 4}, {£,, £ 2 , N - 1 - £?=i ; 3}, {^i, N — 2 — £ ± ; 2}, 
and {iV — 3; 1} (, assuming l x ^ £ 2 ^ £3 > 2, etc.). The first a represents four 
orientation-wheels with all orientation modules belong to one of the four wheels, 
whereas the remaining a's represent three, two, or one orientation- wheel (s) with one, 
two, or three orientation modules disintegrated from the organized wheel(s). The cor- 
responding combinatorial factors are given as Q a = QANg N ~ 4 , A8Ng N ~ 4 , 16Ng N ~ 4 , 
and 2Ng N ~ 4 , respectively. Thus the maximal factor Q a is attained in the first a. In 
Table 3 we list only those maximal configurations a together with their combinatorial 
factors Q a for each g L . 
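The numerical prefactors quoted in this example, and those listed in Table 3, can be checked with a few lines of code. The following is a minimal sketch, assuming the restricted case $n = m$, $C_\sigma = 1$ of equation (20), in which the prefactor of $N g^{L}$ is $2^m \prod_{k=1}^{m-1} (1 + (N - L - m + 1)/k)$. The shorthand `d` for $N - L$, the function names, and the rewriting of the product as the binomial coefficient $\binom{N-L}{m-1}$ are introduced here for illustration and are not part of the original text.

```python
from math import comb


def coefficient(d: int, m: int) -> int:
    """Prefactor of N g^(N-d) in equation (20) for n = m and C_sigma = 1.

    d stands for N - L. The product over k = 1..m-1 of (1 + (d - m + 1)/k)
    equals the binomial coefficient C(d, m - 1).
    """
    if not 1 <= m <= d:
        raise ValueError("need 1 <= m <= d")
    return 2**m * comb(d, m - 1)


def maximal(d: int):
    """Largest prefactor for fixed d, and every wheel number m that attains it."""
    values = {m: coefficient(d, m) for m in range(1, d + 1)}
    best = max(values.values())
    return best, [m for m, v in values.items() if v == best]


# One line per power g^(N-d), in the order used in Table 3.
for d in range(1, 12):
    best, wheel_numbers = maximal(d)
    print(f"L = N-{d}: prefactor {best}, m = {wheel_numbers}")
```

For `d = 4` the four values of `coefficient(4, m)` with `m = 1, ..., 4` correspond to the four configurations of the $L = N - 4$ example, and the cases where `maximal` returns two values of `m` are the ties at $L = N - 5, N - 8, N - 11$.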

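Table 3. Maximal combinatorial factor $Q_\sigma$ evaluated for each covering multiplicity $N$ of a small
segment of retinal surface. The table shows the values of the factor for various possible coverings
specified by $\sigma$.

  Maximal configuration(s) $\sigma$                                                         $Q_\sigma$
  -----------------------------------------------------------------------------------------------------------
  $\{N; 1\}$                                                                                $2Ng^{N-1}$
  $\{\ell_1, N - \ell_1; 2\}$                                                               $8Ng^{N-2}$
  $\{\ell_1, \ell_2, N - \sum_{i=1}^{2} \ell_i; 3\}$                                        $24Ng^{N-3}$
  $\{\ell_1, \ell_2, \ell_3, N - \sum_{i=1}^{3} \ell_i; 4\}$                                $64Ng^{N-4}$
  $\{\ell_1, \ell_2, \ell_3, N - 1 - \sum_{i=1}^{3} \ell_i; 4\}$,
  $\{\ell_1, \ell_2, \ldots, \ell_4, N - \sum_{i=1}^{4} \ell_i; 5\}$                        $160Ng^{N-5}$
  $\{\ell_1, \ell_2, \ldots, \ell_4, N - 1 - \sum_{i=1}^{4} \ell_i; 5\}$                    $480Ng^{N-6}$
  $\{\ell_1, \ell_2, \ldots, \ell_5, N - 1 - \sum_{i=1}^{5} \ell_i; 6\}$                    $1344Ng^{N-7}$
  $\{\ell_1, \ell_2, \ldots, \ell_5, N - 2 - \sum_{i=1}^{5} \ell_i; 6\}$,
  $\{\ell_1, \ell_2, \ldots, \ell_6, N - 1 - \sum_{i=1}^{6} \ell_i; 7\}$                    $3584Ng^{N-8}$
  $\{\ell_1, \ell_2, \ldots, \ell_6, N - 2 - \sum_{i=1}^{6} \ell_i; 7\}$                    $10752Ng^{N-9}$
  $\{\ell_1, \ell_2, \ldots, \ell_7, N - 2 - \sum_{i=1}^{7} \ell_i; 8\}$                    $30720Ng^{N-10}$
  $\{\ell_1, \ell_2, \ldots, \ell_7, N - 3 - \sum_{i=1}^{7} \ell_i; 8\}$,
  $\{\ell_1, \ell_2, \ldots, \ell_8, N - 2 - \sum_{i=1}^{8} \ell_i; 9\}$                    $84480Ng^{N-11}$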







From Table 3 one can immediately see that the first $\sigma = \{N; 1\}$ gets the maximal
$Q_\sigma$ if $g > 4$. The second boundary condition $\sigma = \{\ell, N - \ell; 2\}$ becomes maximal
if $4 > g > 3$. In other words, "all orientations organized into one wheel" becomes
dominant for $g > 4$, and "$\ell$ and $N - \ell$ orientation-modules organized respectively
into independent wheels" is dominant if $4 > g > 3$. One can prove this fact in general,
including all lower power $g^L$-terms, as follows.

Let $\sigma^* = \{\ell_1, \ell_2, \ldots, N - (N - L - m) - \sum_{i=1}^{m-1} \ell_i; m\}$ be a maximal configuration
of $g^L$-terms (with $N - L - m \ge 0$ being the number of modules disintegrated from
the wheels). Then using equation (20) and $Q_{\sigma^*} \ge Q_\sigma$ for given $N$, $L$, one gets

$$ 3m - 2 \ge 2(N - L) \ge 3m - 5, \qquad N - L \ge m. \tag{21} $$

This implies that under the change $L \to L - 1$ the $m$-value $m_L$ associated with the
maximal configuration is restricted as

$$ m_L - \frac{1}{3} \le m_{L-1} \le m_L + \frac{5}{3}. \tag{22} $$

Thus we have either $m_{L-1} = m_L$ or $m_L + 1$. In Table 3 both cases are realized. At $L =
N - 5, N - 8, N - 11, \ldots$ there arise two maximal configurations $\sigma_1^* = \{\ell_1, \ell_2, \ldots, N -
(N - L - m) - \sum_{i=1}^{m-1} \ell_i; m\}$ and $\sigma_2^* = \{\ell_1, \ell_2, \ldots, N - (N - L - m + 1) - \sum_{i=1}^{m-2} \ell_i; m - 1\}$
that have the same maximal factor $Q_{\sigma_1^*} = Q_{\sigma_2^*}$. This situation always occurs for such
$(N, L, m)$ that satisfy $2(N - L) = 3m - 5$. (From the restriction $N - L \ge m$, equation
(21), the case $L = N - 2$ is excluded from this argument.) Choosing the appropriate
configuration $\sigma^*$ ($= \sigma_1^*$ or $\sigma_2^*$) as the maximal configuration for the $g^L$-term if necessary,
one can always consider $m_L$ as satisfying $m_{L-1} = m_L + 1$. (See Table 3.)
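The intermediate steps behind the bounds (21) and (22) can be written out as follows. This is a sketch under the same restriction as above ($n = m$, $C_\sigma = 1$); the shorthand $D \equiv N - L$, the notation $Q(m)$ for the factor at fixed $N$, $L$, and the binomial form of the product in equation (20) are conveniences introduced here, not notation from the original derivation.

```latex
% Assumptions: n = m, C_\sigma = 1, D \equiv N - L (shorthand introduced here).
\begin{align*}
Q(m) &\approx 2^{m} N g^{L} \binom{D}{m-1}, \\[4pt]
\frac{Q(m+1)}{Q(m)} &= \frac{2\,(D - m + 1)}{m} \le 1
   \;\Longrightarrow\; 2(N - L) \le 3m - 2, \\[4pt]
\frac{Q(m)}{Q(m-1)} &= \frac{2\,(D - m + 2)}{m - 1} \ge 1
   \;\Longrightarrow\; 2(N - L) \ge 3m - 5 . \\[8pt]
% Applying the same bounds at L - 1 (i.e. D \to D + 1) and combining with those at L:
3 m_{L-1} - 2 &\ge 2D + 2 \ge 3 m_{L} - 3
   \;\Longrightarrow\; m_{L-1} \ge m_{L} - \tfrac{1}{3}, \\[4pt]
3 m_{L-1} - 5 &\le 2D + 2 \le 3 m_{L}
   \;\Longrightarrow\; m_{L-1} \le m_{L} + \tfrac{5}{3} .
\end{align*}
```

Equality in the second line of the first group, $2(N - L) = 3m - 5$, is exactly the case in which $Q(m) = Q(m-1)$, i.e. the two-fold degeneracy of the maximal configuration.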

Now we consider the condition $Q_{\sigma^*}|_{L} > Q_{\sigma^*}|_{L-1}$ under $m_{L-1} = m_L + 1$ between
neighboring maximal $\sigma^*$ at the $g^L$- and $g^{L-1}$-terms. From the expression (20) this condition
is recast into

$$ g > \frac{2(N - L + 1)}{m_L}, \tag{23} $$

for $L \le N - 2$. However, the right-hand side of this condition is restricted by the
inequality (21) as in

$$ 3 \ge \frac{2(N - L + 1)}{m_L} \ge 3 - \frac{3}{m_L}. \tag{24} $$

Therefore, one concludes that as long as $g > 3$ the relation $Q_{\sigma^*}|_{L} > Q_{\sigma^*}|_{L-1}$ is always
satisfied for all $L$ ($\le N - 2$), and the configurations $\sigma = \{\ell, N - \ell; 2\}$ or $\sigma = \{N; 1\}$
get the largest combinatorial factors $Q_\sigma$ among all possible $\sigma$'s. Obviously, $\sigma = \{N; 1\}$ is
dominant if $g > 4$, and so is $\sigma = \{\ell, N - \ell; 2\}$ if $4 > g$ ($> 3$). This completes the proof.
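The conclusion of the proof can also be checked numerically. The sketch below assumes the restricted form of equation (20) with $n = m$ and $C_\sigma = 1$, divides out the common factor $N g^{N}$, and searches for the configuration with the largest remaining weight. The function names, the shorthand `d` for $N - L$, and the cutoff `d_max` are hypothetical choices made for this illustration; the search is only meant for $g > 3$, where the weights decrease for larger `d`.

```python
from math import comb


def relative_weight(g: float, d: int, m: int) -> float:
    """Q_sigma / (N g^N) from equation (20) with n = m, C_sigma = 1, d = N - L."""
    return 2**m * comb(d, m - 1) / g**d


def dominant(g: float, d_max: int = 12):
    """Return (d, m) with the largest weight: m wheels, d - m modules left out."""
    candidates = [(d, m) for d in range(1, d_max + 1) for m in range(1, d + 1)]
    return max(candidates, key=lambda dm: relative_weight(g, *dm))


for g in (5.0, 3.5):
    d, m = dominant(g)
    print(f"g = {g}: {m} wheel(s), {d - m} disintegrated module(s)")
```

With these inputs the search selects one wheel for the value of $g$ above 4 and two wheels for the value between 3 and 4, in both cases with no modules left outside the wheels, in line with the statement proved above.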

To summarize, we have estimated the partition function (19) of our model, which
consists of the sum of frequencies (or probabilities) $Q_\sigma$ for the occurrence of various
(possible) organizations $\sigma$ of the orientation-modules. In the sum, the organizations
containing $M$ double-arcs (Figure 9(b)) have been shown to carry an extra factor $\Lambda = A_0 e^{-\Delta E_0/\lambda}$
for each double-arc, $Q_\sigma \propto g^L \Lambda^M$, where $A_0$, $\Delta E_0$, $\lambda$ are the measures for hypercolumn area in the
unit of $\sigma_0$, average energy per orientation module, and a coefficient in the lateral
correlation $\Lambda_{ij}$, respectively. We have suppressed those (higher order) terms by an
appropriate choice of $\lambda$ so that the factor $\Lambda < 1$. Among the remaining non-double-arc
organizations (Figure 9(c)), we have proved the following remarkable fact: once $g$ ($=
\zeta\Lambda$) exceeds 3, the most probable organizations are "all orientation-modules organized
into one orientation-wheel (if $g > 4$) or two wheels (if $g < 4$)". This is remarkable
since we do not need any fine-tuning of our parameter $\lambda$ to a special, definite value
to obtain this result. Since $\zeta$ (= measure of the boundary length of the hypercolumn
in the unit of $\sigma_0$) is roughly equal to $24\sqrt{3}$ (Yamagishi, 1994), the condition $g > 3$
holds under quite natural circumstances. In contrast, more than 4 orientation-wheels
per hypercolumn are not easily realized without fine-tuning of $g$ (or the $\lambda$-value). Also
remarkable is the fact that this result is independent of the total module number $N$
as long as $N$ is large.
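To make the phrase "quite natural circumstances" concrete, one can ask how large the double-arc factor must be for $g$ to exceed 3 or 4 while the factor itself stays below 1. The following arithmetic sketch assumes, as the summary states, that $g$ is the product of the boundary-length measure (about $24\sqrt{3}$) and the double-arc factor; the variable and function names are chosen here for illustration only.

```python
from math import sqrt

# Boundary-length measure of the hypercolumn, as quoted from Yamagishi (1994).
boundary_measure = 24 * sqrt(3)


def factor_threshold(g_min: float) -> float:
    """Smallest double-arc factor for which g = boundary_measure * factor exceeds g_min."""
    return g_min / boundary_measure


print(f"boundary measure ~ {boundary_measure:.2f}")
print(f"g > 3 requires a factor above {factor_threshold(3):.4f}")
print(f"g > 4 requires a factor above {factor_threshold(4):.4f}")
```

Both thresholds lie well below 1, so the window in which the factor is smaller than 1 (double-arcs suppressed) while $g$ still exceeds 3 is wide, which is the sense in which no fine-tuning is required.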



5 Summary 

In this paper we have considered the modular organization where the feature space
consists of infinite (continuous) degrees of freedom. Specifically, we have considered
the local organization of orientation modules while restricting ourselves to a hypercolumn
area. Since we do not know the minimal increment (or resolution) of the
orientation preference between neighboring modules at low-level visual processing
(V1), we proposed here a statistical definition for orientation-module boundaries. Each
module size in this definition was considered indefinite and was defined only statistically.
The sizes were considered to be selected in the course of higher-level visual
processing. In the organization of such modules, semi-local continuity of the orientation
preference angles was required for the scheme to work well. In our model of
generalized Kohonen's feature mappings we have successfully realized this scheme.
We have also shown in this model that the existence of orientation-wheels observed
in visual cortex is a consequence of the celebrated Riemann-Hurwitz formula from
(geometric) topology.

This formula from topology came into the game since in our model holomorphicity
plays an important role in determining the stationary points of the net, and in
two dimensions a holomorphic mapping is (locally) conformal. Hence the topological
property has turned out to play an essential role.

Within the scope of the same topological argument we have discussed the most
adequate number of orientation-wheels per hypercolumn. We have estimated it by
evaluating the partition function $Z_N$, equation (19). The resulting most probable
orientation-wheel consists of all orientation modules organized monotonically with
respect to orientation angles if the lateral correlation strength $\lambda$ is chosen in such a
way that $g$ ($= \zeta A_0 e^{-\Delta E_0/\lambda}$, where $\zeta$ [$\approx 24\sqrt{3}$ from the previous work I] is a measure
of the boundary-length of the hypercolumn region) exceeds 4 and $A_0 e^{-\Delta E_0/\lambda} < 1$.
Even if that is not the case, if $g$ exceeds 3, we have two wheels consisting of all
orientations. In contrast, more than four orientation-wheels per hypercolumn are
hard to realize without fine-tuning of the parameter $g$. This is remarkable since we
have the adequate organization of the orientation-wheels independently of the minimal
increment of the orientation angles. This number of orientation-wheels is
also consistent with the observed result on macaque monkeys. This property remains
true under the change of $N$, as long as $N$ is large, without re-tuning the parameter $\lambda$
(hence the $g$-value)‡.

We have left a couple of important issues untouched in this paper. Basically,
what we have considered here are those aspects that can be handled with tools
from topology alone. However, for the analysis of the real cortical surface, such as the global
organization of the orientation modules over a broad range, we have to go beyond
topology. This is a very difficult but important issue. Another point to be clarified
is the selection mechanism of the iso-orientation width (or increment) in the organized
wheels, which we assumed to be a higher-cortical function. We would like to return to these
issues in future publications.

Part of this work was performed under the auspices of the U.S. Department of 
Energy by the Lawrence Livermore National Laboratory under Contract W-7405- 
ENG-48 with the University of California. 



‡Most of the results of this paper are correct if $N$ is larger than approximately 10.

Figure Legends

Figure 1. Ice-cube model of orientation columns by Hubel and Wiesel. Orientation 
columns are regularly embedded in R, L ocular dominance area. 

Figure 2. Examples of a folding map (a) and a line degeneracy (b). The slightly separated
appearance of the leaves in the folding map (a) is for the sake of visibility. These
mappings can occur when the Jacobian $J(v) = \det(\partial v_i(r)/\partial r_j)$ of the mapping
$v(r)$ between retinal and the corresponding cortical surfaces vanishes.

Figure 3. Multiple winning points on a folding cover. When one encounters a folding
cover of the retinal surface, because of the "winner-takes-all" rule each leaf $i - 1$,
$i$, $i + 1$ of the folding cover has to have a different orientation preference w(r*).

Figure 4. Labeling of a multiple covering (unfolded view). As in the case of folding
covers (Figure 3) the "winner-takes-all" rule enforces a different labeling $w_0 =
e^{i\pi k/N}$ for each covering-sheet of the retinal surface in general multiple
coverings.

Figure 5. An unbranched covering of a torus. We write a plaquette for a torus,
assuming identification of like-labeled edges in each side (Figure (a)). An unbranched
$N$-sheeted covering (Figure (b)) is a collection of $N$ such objects glued
together along like-labeled edges. Then one identifies the horizontal edges, and
wraps up the torus with the resulting cylinder by putting one end of the cylinder 
in the other end until the circumferences match up with the original torus (Figure 

Figure 6. A branched covering of a sphere. We start with a 2-sheeted unbranched 
covering of $S^2$, and name the covering sheets "1" and "2". To make from this a 
branched covering, we pick two points (A, B) on the sphere and cut the covering 
sheets along the straight line connecting these two points (Figure (a)). Then we 
identify one edge of the sheet "1" with the edge of the sheet "2" on the other side 
of the cut and the remaining edge of the sheet "1" with that of the sheet "2", i.e., 
we identify the like-labeled edges p, q in the Figure (a). The resulting covering 
is branched; the two points A, B we picked are branch points (of index 2). To 
see its topologically equivalent surface, we peel off the outer sphere, and paste it 
back along the edges p, q as prescribed (Figure (b)). 

Figure 7. 3-sheeted branched-coverings of a sphere. For the case of a 3-sheeted branched
covering with two branch points, we can repeat our construction, using identification
rules as shown in Figure (a). The Figure shows the sectional view from
the north pole. Figure (b) shows another gluing rule in which mixed branched
and unbranched coverings occur. In Figure (c), in another construction, we separated
out the two branch points with ramification indices 3 in Figure (a) into
four branch points with ramification indices 2.

Figure 8. Geometric meaning of the constraint equation (16). We draw small circles
$C_i$ around every branch point $z_i$, and go around them once starting from an
arbitrary non-branch point P on the sphere (Figure (a)). This gives the left-hand
side of the constraint equation (16). Since we are on the sphere, we can deform
this loop (without crossing any branch points) into another loop with the opposite
orientation that surrounds a small region near the starting point P (Figure (b)).
Since the point P is a regular point, this gives the right-hand side of the equation

Figure 9. Coverings of a plaquette. In Figure (a) we start with $N$ unbranched
covering-sheets of the original plaquette. In Figure (b), we connected the covering-sheets
"1" and "2" along a line connecting two singularities A, B, via the permutation
(12), using a similar rule as explained in Figure 6 in the sphere case. The line
connecting the two sheets is called a "double arc". For the case with the non-trivial
boundary condition $\sigma = \sigma(j) = (j, j + 1)$ the intuitive picture of the
covering is just like the double arcs stated above except for one singular point
that is anchored to $\sigma = (j, j + 1)$ on the boundary (Figure (c)). The latter
comprises a branch cut along the line connecting them.

[Figures are embedded in the text.] 



